updated paper reference in the LICENSE file (#31)
- updated paper reference in the LICENSE file
- improved documentation
parent 2978238318
commit 15fb8e661d
7 LICENSE
@@ -46,7 +46,8 @@ following citation:
 ----------------------------------------------------------------------
 Maciej Besta, Nils Blach, Ales Kubicek, Robert Gerstenberger, Lukas
 Gianinazzi, Joanna Gajda, Tomasz Lehmann, Michał Podstawski, Hubert
-Niewiadomski, Piotr Nyczyk, Torsten Hoefler: Graph of Thoughts: Solving
-Elaborate Problems with Large Language Models. In: arXiv preprint
-arXiv:2308.09687
+Niewiadomski, Piotr Nyczyk, Torsten Hoefler (2024): Graph of Thoughts:
+Solving Elaborate Problems with Large Language Models. In: Proceedings
+of the AAAI Conference on Artificial Intelligence, 38(16),
+17682-17690. https://doi.org/10.1609/aaai.v38i16.29720
 ----------------------------------------------------------------------
@@ -4,7 +4,7 @@ The Language Models module is responsible for managing the large language models
 
 Currently, the framework supports the following LLMs:
 - GPT-4 / GPT-3.5 (Remote - OpenAI API)
-- Llama-2 (Local - HuggingFace Transformers)
+- LLaMA-2 (Local - HuggingFace Transformers)
 
 The following sections describe how to instantiate individual LLMs and how to add new LLMs to the framework.
 
@@ -13,50 +13,50 @@ The following sections describe how to instantiate individual LLMs and how to ad
 - Fill configuration details based on the used model (below).
 
 ### GPT-4 / GPT-3.5
-- Adjust predefined `chatgpt`, `chatgpt4` or create new configuration with an unique key.
+- Adjust the predefined `chatgpt` or `chatgpt4` configurations or create a new configuration with a unique key.
 
 | Key | Value |
 |---------------------|-------------------------------------------------------------------------------|
 | model_id | Model name based on [OpenAI model overview](https://platform.openai.com/docs/models/overview). |
 | prompt_token_cost | Price per 1000 prompt tokens based on [OpenAI pricing](https://openai.com/pricing), used for calculating cumulative price per LLM instance. |
 | response_token_cost | Price per 1000 response tokens based on [OpenAI pricing](https://openai.com/pricing), used for calculating cumulative price per LLM instance. |
-| temperature | Parameter of OpenAI models that controls randomness and the creativity of the responses (higher temperature = more diverse and unexpected responses). Value between 0.0 and 2.0, default is 1.0. More information can be found in the [OpenAI API reference](https://platform.openai.com/docs/api-reference/completions/create#completions/create-temperature). |
+| temperature | Parameter of OpenAI models that controls the randomness and the creativity of the responses (higher temperature = more diverse and unexpected responses). Value between 0.0 and 2.0, default is 1.0. More information can be found in the [OpenAI API reference](https://platform.openai.com/docs/api-reference/completions/create#completions/create-temperature). |
 | max_tokens | The maximum number of tokens to generate in the chat completion. Value depends on the maximum context size of the model specified in the [OpenAI model overview](https://platform.openai.com/docs/models/overview). More information can be found in the [OpenAI API reference](https://platform.openai.com/docs/api-reference/chat/create#chat/create-max_tokens). |
-| stop | String or array of strings specifying sequence of characters which if detected, stops further generation of tokens. More information can be found in the [OpenAI API reference](https://platform.openai.com/docs/api-reference/chat/create#chat/create-stop). |
+| stop | String or array of strings specifying sequences of characters which, if detected, stop further generation of tokens. More information can be found in the [OpenAI API reference](https://platform.openai.com/docs/api-reference/chat/create#chat/create-stop). |
 | organization | Organization to use for the API requests (may be empty). |
 | api_key | Personal API key that will be used to access the OpenAI API. |
 
 - Instantiate the language model based on the selected configuration key (predefined / custom).
-```
+```python
 lm = controller.ChatGPT(
     "path/to/config.json",
     model_name=<configuration key>
 )
 ```
 
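As an illustration, a configuration file matching the keys in the table above might look as follows. The configuration key `chatgpt4-custom` and all values are hypothetical placeholders, not values shipped with the framework:

```json
{
    "chatgpt4-custom": {
        "model_id": "gpt-4",
        "prompt_token_cost": 0.03,
        "response_token_cost": 0.06,
        "temperature": 1.0,
        "max_tokens": 4096,
        "stop": null,
        "organization": "",
        "api_key": "<your OpenAI API key>"
    }
}
```

The top-level key is what gets passed as `model_name` during instantiation.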
-### Llama-2
+### LLaMA-2
 - Requires local hardware to run inference and a HuggingFace account.
-- Adjust predefined `llama7b-hf`, `llama13b-hf`, `llama70b-hf` or create a new configuration with an unique key.
+- Adjust the predefined `llama7b-hf`, `llama13b-hf` or `llama70b-hf` configurations or create a new configuration with a unique key.
 
 | Key | Value |
 |---------------------|-----------------------------------------------------------------------|
-| model_id | Specifies HuggingFace Llama 2 model identifier (`meta-llama/<model_id>`). |
-| cache_dir | Local directory where model will be downloaded and accessed. |
+| model_id | Specifies the HuggingFace LLaMA-2 model identifier (`meta-llama/<model_id>`). |
+| cache_dir | Local directory where the model will be downloaded and accessed. |
 | prompt_token_cost | Price per 1000 prompt tokens (currently not used - local model = no cost). |
 | response_token_cost | Price per 1000 response tokens (currently not used - local model = no cost). |
-| temperature | Parameter that controls randomness and the creativity of the responses (higher temperature = more diverse and unexpected responses). Value between 0.0 and 1.0, default is 0.6. |
+| temperature | Parameter that controls the randomness and the creativity of the responses (higher temperature = more diverse and unexpected responses). Value between 0.0 and 1.0, default is 0.6. |
 | top_k | Top-K sampling method described in the [Transformers tutorial](https://huggingface.co/blog/how-to-generate). Default value is set to 10. |
 | max_tokens | The maximum number of tokens to generate in the chat completion. More tokens require more memory. |
 
 - Instantiate the language model based on the selected configuration key (predefined / custom).
-```
+```python
 lm = controller.Llama2HF(
     "path/to/config.json",
     model_name=<configuration key>
 )
 ```
-- Request access to Llama-2 via the [Meta form](https://ai.meta.com/resources/models-and-libraries/llama-downloads/) using the same email address as for the HuggingFace account.
-- After the access is granted, go to [HuggingFace Llama-2 model card](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf), log in and accept the license (_"You have been granted access to this model"_ message should appear).
+- Request access to LLaMA-2 via the [Meta form](https://ai.meta.com/resources/models-and-libraries/llama-downloads/) using the same email address as for the HuggingFace account.
+- After access is granted, go to the [HuggingFace LLaMA-2 model card](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf), log in and accept the license (a _"You have been granted access to this model"_ message should appear).
 - Generate a HuggingFace access token.
 - Log in from the CLI with: `huggingface-cli login --token <your token>`.
 
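The `top_k` and `temperature` parameters in the table above can be illustrated with a small self-contained sketch of Top-K sampling. This is a conceptual illustration only, not the code the framework or HuggingFace Transformers actually runs:

```python
import math
import random

def top_k_sample(logits: dict, k: int = 10, temperature: float = 0.6) -> str:
    """Conceptual Top-K sampling: keep the k highest-scoring tokens,
    rescale by temperature, apply softmax, then draw one token."""
    # Keep only the k tokens with the highest logits.
    top = sorted(logits.items(), key=lambda kv: kv[1], reverse=True)[:k]
    # Lower temperature sharpens the distribution (less random output).
    weights = [math.exp(value / temperature) for _, value in top]
    total = sum(weights)
    probs = [w / total for w in weights]
    return random.choices([token for token, _ in top], weights=probs)[0]

# With k=2, the lowest-scoring token ("maybe") can never be sampled.
token = top_k_sample({"yes": 3.0, "no": 2.5, "maybe": 0.1}, k=2)
```

Raising `k` widens the candidate pool; raising `temperature` flattens the probabilities, so both increase output diversity.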
@@ -64,9 +64,9 @@ Note: 4-bit quantization is used to reduce the model size for inference. During
 
 ## Adding LLMs
 More LLMs can be added by following these steps:
-- Create new class as a subclass of `AbstractLanguageModel`.
-- Use the constructor for loading configuration and instantiating the language model (if needed).
-```
+- Create a new class as a subclass of `AbstractLanguageModel`.
+- Use the constructor for loading the configuration and instantiating the language model (if needed).
+```python
 class CustomLanguageModel(AbstractLanguageModel):
     def __init__(
         self,
@@ -81,15 +81,15 @@ class CustomLanguageModel(AbstractLanguageModel):
 
         # Instantiate LLM if needed
 ```
-- Implement `query` abstract method that is used to get a list of responses from the LLM (call to remote API or local model inference).
-```
+- Implement the `query` abstract method that is used to get a list of responses from the LLM (remote API call or local model inference).
+```python
 def query(self, query: str, num_responses: int = 1) -> Any:
     # Support caching
     # Call LLM and retrieve list of responses - based on num_responses
     # Return LLM response structure (not only raw strings)
 ```
-- Implement `get_response_texts` abstract method that is used to get a list of raw texts from the LLM response structure produced by `query`.
-```
-def get_response_texts(self, query_response: Union[List[Dict], Dict]) -> List[str]:
+- Implement the `get_response_texts` abstract method that is used to get a list of raw texts from the LLM response structure produced by `query`.
+```python
+def get_response_texts(self, query_response: Union[List[Any], Any]) -> List[str]:
     # Retrieve list of raw strings from the LLM response structure
 ```
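Putting the steps above together, a minimal runnable sketch might look as follows. `AbstractLanguageModel` is stubbed here as a bare `abc` base class so the example is self-contained; in the framework itself the base class also loads the configuration file, so treat this only as an outline of the two abstract methods:

```python
from abc import ABC, abstractmethod
from typing import Any, List, Union

class AbstractLanguageModel(ABC):
    """Stubbed stand-in for the framework's base class (assumption:
    the real class additionally handles configuration and caching)."""

    @abstractmethod
    def query(self, query: str, num_responses: int = 1) -> Any: ...

    @abstractmethod
    def get_response_texts(self, query_response: Union[List[Any], Any]) -> List[str]: ...

class EchoLanguageModel(AbstractLanguageModel):
    """Toy model that echoes the prompt, illustrating the required shape:
    query returns a response structure, not raw strings."""

    def query(self, query: str, num_responses: int = 1) -> Any:
        # Return a structured response (one dict per response), not bare strings.
        return [{"text": f"echo: {query}", "index": i} for i in range(num_responses)]

    def get_response_texts(self, query_response: Union[List[Any], Any]) -> List[str]:
        # Retrieve the list of raw strings from the response structure.
        if not isinstance(query_response, list):
            query_response = [query_response]
        return [r["text"] for r in query_response]

lm = EchoLanguageModel()
responses = lm.query("Hello", num_responses=2)
texts = lm.get_response_texts(responses)  # ["echo: Hello", "echo: Hello"]
```

A real subclass would replace the body of `query` with a remote API call or local inference, keeping the same structured-response contract.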
|
|||||||
@ -7,6 +7,6 @@ The poster presented at the 2024 Association for the Advancement of Artificial I
|
|||||||
|
|
||||||
## Plot Data
|
## Plot Data
|
||||||
|
|
||||||
The data used to create the figure of the arXiv preprint article can be
|
The data used to create the figures of the paper can be
|
||||||
found in the `final_results_gpt35.tar.bz2` archive. Unpack the archive
|
found in the `final_results_gpt35.tar.bz2` archive. Unpack the archive
|
||||||
and run the file `plots.py`.
|
and run the file `plots.py`.
|
||||||
|
|||||||
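The unpack-and-run step above amounts to two commands. The snippet below demonstrates the tar flags against a throwaway archive so it can be tried anywhere; with the real data it is simply `tar -xjf final_results_gpt35.tar.bz2` followed by `python plots.py`:

```shell
# Build a throwaway bzip2 archive standing in for final_results_gpt35.tar.bz2.
cd "$(mktemp -d)"
echo "placeholder results" > results.csv
tar -cjf final_results_demo.tar.bz2 results.csv
rm results.csv
# Unpack it with the same flags: -x extract, -j bzip2, -f archive file.
tar -xjf final_results_demo.tar.bz2
cat results.csv
```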