Gorilla: LLM Connected with APIs

Gorilla is a large language model (LLM) capable of invoking APIs. It is trained on a wide range of API documentation and can generate the appropriate API call for a natural-language query, including the correct input parameters. Gorilla is more accurate than prior approaches to API invocation and is less likely to hallucinate incorrect API usage.

Gorilla is useful for developers who want to automate operations or build applications on top of APIs. Researchers studying API use in natural language processing can also benefit from it.

How to install Gorilla Language Model

  1. Install Dependencies:
  • Open your terminal or command prompt.
  • To create a new Conda environment called “gorilla” with Python 3.10, use the following command:

conda create -n gorilla python=3.10

  • Activate the “gorilla” environment:

conda activate gorilla

  • Install the necessary Python packages with the following command, assuming you have a file called “requirements.txt” that lists the dependencies:

pip install -r requirements.txt
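
Once the environment is active, a quick sanity check can confirm that the interpreter and dependencies match what the steps above set up. The following is a minimal sketch and not part of the Gorilla repository: the helper names `python_matches` and `missing_packages` are hypothetical, and the requirement parsing is deliberately simplified (it only extracts the package name from plain lines such as `name` or `name==version`).

```python
import re
import sys
from importlib import metadata


def python_matches(required=(3, 10), actual=None):
    """Return True if the interpreter's major.minor version equals `required`."""
    # `actual` is only there so the check can be exercised without
    # switching interpreters; by default the running interpreter is used.
    actual = sys.version_info[:2] if actual is None else tuple(actual[:2])
    return actual == tuple(required)


def missing_packages(requirement_lines):
    """Return package names from requirements-style lines that are not installed.

    Simplified on purpose: comments, blank lines and option lines
    (for example "-r other.txt") are skipped, and version constraints are ignored.
    """
    missing = []
    for line in requirement_lines:
        line = line.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue
        # Keep only the distribution name in front of any version specifier.
        name = re.split(r"[<>=!~;\[\s]", line, maxsplit=1)[0]
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

Inside the activated “gorilla” environment, `python_matches()` should return `True`, and passing the lines of requirements.txt to `missing_packages` should return an empty list once the install has finished.
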

  2. Install Gorilla Delta Weights:
    • Obtain the original LLaMA weights from the provided link.
    • Download the Gorilla delta weights from the Hugging Face repository.
  3. Apply the Delta Weights:
    • Replace the placeholders in the following Python command with the proper file paths:

python apply_delta.py --base-model-path path/to/hf_llama/ --target-model-path path/to/gorilla-7b-hf-v0 --delta-path path/to/models--gorilla-llm--gorilla-7b-hf-delta-v0

  • The delta weights are applied to your LLaMA model with this command.
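
To make the idea of “applying” delta weights concrete, here is a minimal conceptual sketch. It assumes the usual convention for delta releases, namely that each delta tensor stores the difference between the fine-tuned weights and the base weights, so the target is recovered by element-wise addition. The article does not spell out how apply_delta.py works internally, so treat this as an illustration only: the function `apply_delta` below is a hypothetical stand-in that operates on plain Python lists rather than real model tensors.

```python
def apply_delta(base_weights, delta_weights):
    """Return target weights computed as base + delta for every parameter.

    Both arguments map a parameter name to a flat list of floats.
    This mirrors the assumed relation: target = base + delta.
    """
    if base_weights.keys() != delta_weights.keys():
        raise ValueError("base and delta must contain the same parameter names")

    target = {}
    for name, base in base_weights.items():
        delta = delta_weights[name]
        if len(base) != len(delta):
            raise ValueError(f"size mismatch for parameter {name!r}")
        # Element-wise addition recovers the fine-tuned values.
        target[name] = [b + d for b, d in zip(base, delta)]
    return target
```

Under this assumption, the delta files are of little use on their own: the original LLaMA weights must be present before the Gorilla model can be reconstructed.
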
  4. Using CLI for Inference:
  • To begin interacting with the Gorilla model using the command-line interface (CLI), use the following command:

python serve/gorilla_cli.py --model-path path/to/gorilla-7b-{hf,th,tf}-v0

  • Replace path/to/gorilla-7b-{hf,th,tf}-v0 with the actual path to the Gorilla model you want to use.
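
The model path in the command above contains a group of comma-separated alternatives (hf, th, tf). Assuming this is shorthand for three separate model variants, of which one is chosen per run, the sketch below expands the placeholder into the individual candidate paths. The helper name `expand_braces` is hypothetical and handles only a single, non-nested group.

```python
import re


def expand_braces(pattern):
    """Expand the first {a,b,c} group in `pattern` into one string per option.

    Returns [pattern] unchanged if there is no brace group.
    """
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head = pattern[:match.start()]
    tail = pattern[match.end():]
    return [head + option + tail for option in match.group(1).split(",")]


# One candidate path per model variant named in the placeholder.
candidates = expand_braces("path/to/gorilla-7b-{hf,th,tf}-v0")
```

Only one of the expanded paths should be passed to the CLI at a time.
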
  5. Batch Inference on a Prompt File (optional):
  • Create a JSONL file with the queries you want the Gorilla model to answer. Each line should be a JSON object with a “question_id” field and a “text” field.
  • Replace the placeholders with the proper file locations and run the following command:

python gorilla_eval.py --model-path path/to/gorilla-7b-hf-v0 --question-file path/to/questions.jsonl --answer-file path/to/answers.jsonl

This script runs batch inference on the questions in the input file and saves the generated answers to the output file.
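
The question file described above can be produced with a few lines of Python. The sketch below builds JSONL text with one JSON object per line, each carrying the “question_id” and “text” fields the article mentions. The helper name `questions_to_jsonl` is hypothetical, and numbering the IDs as consecutive integers starting at 1 is an assumption, since the article does not say what type or range the IDs should have.

```python
import json


def questions_to_jsonl(questions, first_id=1):
    """Serialize an iterable of question strings as JSONL text.

    Each output line is a JSON object with "question_id" and "text".
    """
    lines = []
    for offset, text in enumerate(questions):
        record = {"question_id": first_id + offset, "text": text}
        lines.append(json.dumps(record, ensure_ascii=False))
    # One record per line, each terminated by a newline.
    return "".join(line + "\n" for line in lines)
```

Writing the returned string to a file such as questions.jsonl (opened in text mode with UTF-8 encoding) yields a file in the shape the batch command expects for its question file argument.
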

Repository Structure

The repository organization of Gorilla is as follows:

The data folder contains a variety of datasets, including API documentation and the community-contributed APIBench dataset.

  • Each file in the api subdirectory represents an API and is named {api_name}_api.jsonl.
  • The apibench subfolder includes LLM model training and evaluation datasets. It contains the files {api_name}_train.jsonl and {api_name}_eval.jsonl.
  • APIs supplied by the community may be found in the apizoo subdirectory.
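
The naming convention for the data folder can be captured in a small helper that builds the expected file locations for a given API. This is a sketch based only on the layout described above: the function name `data_files` is hypothetical, the root directory name `data` is taken from the text, and the API name used when calling it is whatever name the repository actually uses.

```python
from pathlib import PurePosixPath


def data_files(api_name, root="data"):
    """Return the expected dataset paths for one API, following the
    {api_name}_api.jsonl / _train.jsonl / _eval.jsonl convention."""
    root = PurePosixPath(root)
    return {
        "api": root / "api" / f"{api_name}_api.jsonl",
        "train": root / "apibench" / f"{api_name}_train.jsonl",
        "eval": root / "apibench" / f"{api_name}_eval.jsonl",
    }
```

The function only constructs paths and does not check that the files exist, so it can be used to discover quickly whether a local checkout matches the described layout.
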

The eval folder includes evaluation code and outputs.

  • The README.md file most likely contains instructions or details about the evaluation process.
  • To receive replies from the LLM models, use the get_llm_responses.py script.
  • The subdirectory eval-scripts includes evaluation scripts for each API, such as ast_eval_{api_name}.py.
  • The eval-data subdirectory includes evaluation questions and replies.
    • The question files in the questions subfolder are organized by API name and evaluation metric.
      • Within the questions subdirectory, each API folder has files named questions_{api_name}_{eval_metric}.jsonl.
    • Response files are likewise organized in the responses subfolder by API name and evaluation metric.
      • Within the responses subfolder, each API folder contains files named responses_{api_name}_Gorilla_FT_{eval_metric}.jsonl and responses_{api_name}_Gorilla_RT_{eval_metric}.jsonl.
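
The evaluation data follows a similar naming scheme, which the sketch below turns into paths for one API and one evaluation metric. Several details are assumptions: the helper name `eval_files` is hypothetical, the location of eval-data directly under eval is inferred from the description, and the response file names are written with underscores between every part (responses_{api_name}_Gorilla_FT_{eval_metric}.jsonl), by analogy with the question file pattern.

```python
from pathlib import PurePosixPath


def eval_files(api_name, eval_metric, root="eval/eval-data"):
    """Return the expected question and response paths for one API
    and one evaluation metric."""
    root = PurePosixPath(root)
    suffix = f"{eval_metric}.jsonl"
    responses_dir = root / "responses" / api_name
    return {
        "questions": root / "questions" / api_name
        / f"questions_{api_name}_{suffix}",
        # FT and RT are the two response variants named in the text.
        "responses_ft": responses_dir / f"responses_{api_name}_Gorilla_FT_{suffix}",
        "responses_rt": responses_dir / f"responses_{api_name}_Gorilla_RT_{suffix}",
    }
```

As with any path helper of this kind, it is worth comparing the generated names against the actual repository contents before relying on them.
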

The inference folder contains code for running Gorilla locally.

  • This folder’s README.md file most likely contains instructions for executing the inference code.
  • The serve subdirectory contains Gorilla command-line interface (CLI) scripts and a chat template.

The train folder is marked “Coming Soon!” and is most likely intended to hold Gorilla model training code; it is not yet available.

You can refer to the README files in each folder for more specific instructions and information on using the provided code and datasets.

Limitations & Social Impacts

The Gorilla authors chose ML APIs because their functional similarity makes for a challenging dataset. A potential downside of ML-focused APIs is that they can produce biased predictions when trained on skewed data, which may disadvantage some sub-groups. To address this concern and promote a better understanding of these APIs, the authors are releasing a large dataset of over 11,000 instruction-API pairs. This resource gives the wider community a tool for studying and evaluating existing APIs, which should lead to a more equitable and effective use of machine learning.
