Get in touch
Email me at adityaacodes01@gmail.com
This script automates preparing data for finetuning OpenAI models, specifically GPT-3.5 and Babbage. It also provides utilities to validate the data, convert it to the required JSONL format, and estimate the cost of the finetuning process.
pyfiglet, openai, tiktoken, dotenv, argparse, json, re, os, sys, time, clint

To install the required libraries:
pip install pyfiglet openai tiktoken python-dotenv argparse clint
or
pip install -r requirements.txt
python ftup.py [-k <API_KEY>] -m <MODEL_NAME> -f <INPUT_FILE> [-s <SUFFIX>] [-e <EPOCHS>]
Arguments:
-k, --key: Optional. API key. If omitted, an API key must be available in the OPENAI_API_KEY environment variable.
-m, --model: Required. Model to use. Options: gpt for gpt-3.5-turbo-0613 or bab for babbage-002.
-f, --file: Required. Input data file (JSONL format).
-s, --suffix: Optional. Suffix for your finetuned model, e.g. 'my-suffix-title-v-1'.
-e, --epoch: Optional. Number of epochs for training. Default is 3.

Store your API key in a .env file in the format:
OPENAI_API_KEY=your_api_key_here
The script loads this key by default if -k / --key is not passed as an argument.
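The flags and the key fallback described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual code of ftup.py: the names build_parser and resolve_api_key are hypothetical, and the sketch reads OPENAI_API_KEY from the process environment only. Loading the .env file (presumably done with python-dotenv in the real script) is left out to keep the example dependency-free.

```python
import argparse
import os


def build_parser():
    """Build a parser mirroring the documented flags (hypothetical sketch)."""
    parser = argparse.ArgumentParser(prog="ftup.py")
    parser.add_argument("-k", "--key", help="API key (falls back to OPENAI_API_KEY)")
    parser.add_argument("-m", "--model", required=True, choices=["gpt", "bab"])
    parser.add_argument("-f", "--file", required=True, help="input JSONL file")
    parser.add_argument("-s", "--suffix", help="suffix for the finetuned model")
    parser.add_argument("-e", "--epoch", type=int, default=3)
    return parser


def resolve_api_key(cli_key, env=None):
    """Prefer the key passed on the command line, else use OPENAI_API_KEY."""
    env = os.environ if env is None else env
    key = cli_key or env.get("OPENAI_API_KEY")
    if not key:
        raise SystemExit("No API key: pass -k/--key or set OPENAI_API_KEY")
    return key


# Example: parse an explicit argument list instead of sys.argv.
args = build_parser().parse_args(["-m", "gpt", "-f", "data.jsonl"])
print(args.model, args.file, args.epoch)
```

Passing the environment as a parameter keeps the lookup easy to test without touching the real process environment.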
check_key(key): Validates the format of the OpenAI API key.
check_model(model): Validates the model name.
check_jsonl_file(file): Checks that the provided file has a valid JSONL name and that it exists.
create_update_jsonl_file(model, file): Checks that the JSONL file has the correct format and uploads it to OpenAI.
update_ft_job(file_id_name, model, suffix, epoch): Creates or updates the finetuning job on OpenAI.
check_jsonl_gpt35(file): Validates the format for GPT-3.5 training.
check_jsonl_babbage(file): Validates the format for Babbage-002 training.
cost_gpt(file, epochs): Estimates the cost of the finetuning process.
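To give a rough idea of what the format checks and the cost estimate involve, here is a minimal sketch. It does not reproduce the script's own functions: check_chat_line, check_completion_line, and estimate_cost are hypothetical names. It assumes the chat format (a "messages" list with "role" and "content") for GPT-3.5 and the prompt/completion format for Babbage-002. The price per 1,000 tokens is supplied by the caller because rates change, and the token count is taken as given (the real script presumably counts tokens with tiktoken).

```python
import json

# Assumption: only these three roles are accepted in this sketch.
CHAT_ROLES = {"system", "user", "assistant"}


def check_chat_line(line):
    """Return an error string for one chat-format JSONL line, or None if valid."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError as exc:
        return f"invalid JSON: {exc.msg}"
    if not isinstance(record, dict):
        return "line is not a JSON object"
    messages = record.get("messages")
    if not isinstance(messages, list) or not messages:
        return "missing or empty 'messages' list"
    for msg in messages:
        if not isinstance(msg, dict):
            return "message is not an object"
        role = msg.get("role")
        if not isinstance(role, str) or role not in CHAT_ROLES:
            return f"unexpected role: {role!r}"
        if not isinstance(msg.get("content"), str):
            return "message 'content' must be a string"
    if not any(m["role"] == "assistant" for m in messages):
        return "no assistant message to train on"
    return None


def check_completion_line(line):
    """Return an error string for one prompt/completion line, or None if valid."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError as exc:
        return f"invalid JSON: {exc.msg}"
    if not isinstance(record, dict):
        return "line is not a JSON object"
    for field in ("prompt", "completion"):
        if not isinstance(record.get(field), str):
            return f"'{field}' must be a string"
    return None


def estimate_cost(total_tokens, epochs, price_per_1k_tokens):
    """Training cost = tokens in the file x epochs x price per 1,000 tokens."""
    return total_tokens / 1000 * price_per_1k_tokens * epochs
```

Returning an error string rather than raising makes it easy to collect every bad line in a file and report them together before uploading.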