utils
onehot – Computes the one-hot encoding of a protein sequence.
tokenize – Tokenizes the sequence.
positional_encoding – Sinusoidal encoding of sequence position.
compose_embeddings – Composes multiple embeddings into one by concatenating the results.
save – Saves an object to pickle, json, or json.gz (determined by the extension in the file name).
load – Loads a pickle, json or json.gz file.
download_url – Downloads a file from a URL.
extract_tar – Extracts a tar file.
zip_file – Zips a file.
unzip_file – Unzips a .gz file.
Writes a list of protein dictionaries to an Avro file.
- onehot(sequence, resolution='residue')
Computes the one-hot encoding of a protein sequence.
- tokenize(sequence, resolution='residue')
Tokenizes the sequence.
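The relationship between tokenizing and one-hot encoding can be illustrated with a minimal, self-contained sketch. It does not call the functions documented here. The alphabet of 20 standard amino acids, its ordering, and the helper names `tokenize_residues` and `onehot_residues` are assumptions made for illustration, and only residue-level resolution is shown.

```python
import numpy as np

# Assumed alphabet: the 20 standard amino acids in alphabetical order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def tokenize_residues(sequence):
    """Map each residue letter to its integer index in the assumed alphabet."""
    return np.array([INDEX[aa] for aa in sequence], dtype=int)


def onehot_residues(sequence):
    """Return a (length, 20) matrix with a single 1.0 per row."""
    tokens = tokenize_residues(sequence)
    encoding = np.zeros((len(tokens), len(AMINO_ACIDS)))
    encoding[np.arange(len(tokens)), tokens] = 1.0
    return encoding


tokens = tokenize_residues("ACD")
encoding = onehot_residues("ACD")
```

Each row of `encoding` corresponds to one residue, and the position of the 1.0 in that row is the residue's token.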
- positional_encoding(sequence, dim=128)
Sinusoidal encoding of sequence position.
- Parameters:
sequence (str) – The protein sequence
- Returns:
The embedded sequence.
- Return type:
ndarray
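As an illustration of what a sinusoidal position encoding looks like, the sketch below follows the common Transformer-style formulation, with sine on even columns and cosine on odd columns. Whether positional_encoding uses exactly this formula, base constant, or column layout is an assumption; the helper name `sinusoidal_encoding` is hypothetical.

```python
import numpy as np


def sinusoidal_encoding(sequence, dim=128):
    """Return a (len(sequence), dim) array of sine/cosine position features."""
    positions = np.arange(len(sequence))[:, None]  # shape (L, 1)
    # One frequency per sine/cosine pair, decaying geometrically (assumed base 10000).
    freqs = np.exp(-np.log(10000.0) * np.arange(0, dim, 2) / dim)
    angles = positions * freqs  # shape (L, ceil(dim / 2))
    encoding = np.zeros((len(sequence), dim))
    encoding[:, 0::2] = np.sin(angles)
    # Slice so that an odd dim (one fewer cosine column) also works.
    encoding[:, 1::2] = np.cos(angles[:, : dim // 2])
    return encoding


encoding = sinusoidal_encoding("MKTA", dim=128)
```

The output depends only on the length of the sequence, not on its residues, so it is usually combined with a content-based embedding.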
- compose_embeddings(embeddings)
Composes multiple embeddings into one by concatenating the results.
- Parameters:
embeddings (list) – A list of embeddings
- Returns:
A substitute embedding function.
- Return type:
function
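Composition by concatenation can be sketched as a function that returns a new function. The example below is a hedged illustration, not the library's implementation: it assumes each embedding maps a sequence to a 2D array with one row per residue and that the results are joined along the last (feature) axis. The names `compose`, `ones_embedding`, and `index_embedding` are hypothetical.

```python
import numpy as np


def compose(embeddings):
    """Return a function that applies every embedding and concatenates the results."""
    def composed(sequence):
        # Assumption: embeddings are joined along the feature axis.
        return np.concatenate([embed(sequence) for embed in embeddings], axis=-1)
    return composed


# Two toy embeddings used only for this illustration.
def ones_embedding(sequence):
    return np.ones((len(sequence), 2))


def index_embedding(sequence):
    return np.arange(len(sequence), dtype=float)[:, None]


combined = compose([ones_embedding, index_embedding])
result = combined("MKT")
```

Here `result` has two columns from the first embedding followed by one column from the second.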
- save(obj, path)
Saves an object as pickle, json, or json.gz (determined by the extension in the file name).
- Parameters:
obj – The object to be saved.
path – The path to save the object.
- load(path)
Loads a pickle, json or json.gz file.
- Parameters:
path – The path to be loaded.
- Returns:
The loaded object.
- Return type:
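Choosing the serialization format from the file extension can be sketched with the standard library alone. This is a minimal illustration under stated assumptions: the `.pkl` extension for pickle files and the helper names `save_by_extension` and `load_by_extension` are hypothetical and are not taken from the documented functions.

```python
import gzip
import json
import os
import pickle
import tempfile


def save_by_extension(obj, path):
    """Write obj as gzipped JSON, JSON, or pickle depending on the extension."""
    if path.endswith(".json.gz"):
        with gzip.open(path, "wt", encoding="utf-8") as f:
            json.dump(obj, f)
    elif path.endswith(".json"):
        with open(path, "w", encoding="utf-8") as f:
            json.dump(obj, f)
    elif path.endswith(".pkl"):  # assumed pickle extension
        with open(path, "wb") as f:
            pickle.dump(obj, f)
    else:
        raise ValueError(f"Unsupported extension: {path}")


def load_by_extension(path):
    """Read back an object written by save_by_extension."""
    if path.endswith(".json.gz"):
        with gzip.open(path, "rt", encoding="utf-8") as f:
            return json.load(f)
    if path.endswith(".json"):
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    if path.endswith(".pkl"):
        with open(path, "rb") as f:
            return pickle.load(f)
    raise ValueError(f"Unsupported extension: {path}")


record = {"ID": "example", "sequence": "MKT"}
roundtrip = {}
with tempfile.TemporaryDirectory() as tmp:
    for name in ("record.pkl", "record.json", "record.json.gz"):
        path = os.path.join(tmp, name)
        save_by_extension(record, path)
        roundtrip[name] = load_by_extension(path)
```

Note that JSON only round-trips basic types (dicts with string keys, lists, numbers, strings), whereas pickle preserves arbitrary Python objects but should only be loaded from trusted sources.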
- download_url(url, out_path, verbosity=2, chunk_size=10485760)
Downloads a file from a URL. If out_path is a directory, the file is saved there under the basename of the URL.
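The directory rule described above, together with a chunked download, can be sketched with the standard library. This is an illustration under assumptions, not the documented implementation: `resolve_target` and `fetch` are hypothetical names, the example URL is a placeholder, and progress reporting (verbosity) is omitted. The default chunk size of 10485760 bytes equals 10 MiB.

```python
import os
import shutil
import tempfile
from urllib.parse import urlparse
from urllib.request import urlopen


def resolve_target(url, out_path):
    """If out_path is an existing directory, save under the URL basename."""
    if os.path.isdir(out_path):
        return os.path.join(out_path, os.path.basename(urlparse(url).path))
    return out_path


def fetch(url, out_path, chunk_size=10 * 1024 * 1024):
    """Stream the response to disk in chunks instead of loading it into memory."""
    target = resolve_target(url, out_path)
    with urlopen(url) as response, open(target, "wb") as f:
        shutil.copyfileobj(response, f, chunk_size)
    return target


# Only the path logic is exercised here, so no network access is needed.
URL = "https://example.org/data/proteins.tar.gz"  # placeholder URL
with tempfile.TemporaryDirectory() as tmp:
    to_dir = resolve_target(URL, tmp)
    expected_in_dir = os.path.join(tmp, "proteins.tar.gz")
    to_file = resolve_target(URL, os.path.join(tmp, "renamed.bin"))
    expected_file = os.path.join(tmp, "renamed.bin")
```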
- extract_tar(tar_path, out_path, extract_members=False, strip=0, verbosity=2)
Extracts a tar file.
- Parameters:
tar_path – The path to the tar file.
out_path – The directory to extract to.
extract_members (bool, default False) – If True, the members of the tar file are extracted directly into out_path instead of into a new subdirectory.
strip (int, default 0) – The number of folder levels to remove from the path of each extracted file.
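The effect of a strip parameter can be illustrated with the standard tarfile module. The sketch below assumes that strip removes leading path components, similar to the --strip-components option of tar; that interpretation, and the helper name `extract_stripped`, are assumptions. Only regular files are handled, and members with absolute paths or ".." components are skipped.

```python
import os
import shutil
import tarfile
import tempfile
from pathlib import PurePosixPath


def extract_stripped(tar_path, out_path, strip=0):
    """Extract regular files, dropping the first `strip` folders of each path."""
    extracted = []
    with tarfile.open(tar_path) as tar:
        for member in tar.getmembers():
            name = PurePosixPath(member.name)
            if not member.isfile() or name.is_absolute():
                continue
            parts = name.parts[strip:]
            if not parts or ".." in parts:
                continue  # nothing left after stripping, or unsafe path
            target = os.path.join(out_path, *parts)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with tar.extractfile(member) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            extracted.append(target)
    return extracted


with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "a.txt")
    with open(source, "w") as f:
        f.write("hello")
    archive = os.path.join(tmp, "demo.tar")
    with tarfile.open(archive, "w") as tar:
        tar.add(source, arcname="top/sub/a.txt")
    out_dir = os.path.join(tmp, "out")
    extracted = extract_stripped(archive, out_dir, strip=1)
    relative = [os.path.relpath(p, out_dir) for p in extracted]
    with open(extracted[0]) as f:
        content = f.read()
```

With strip=1, the archived path top/sub/a.txt is written to sub/a.txt inside the output directory.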
- zip_file(path)
Zips a file.
- Parameters:
path – The path to the file.
- unzip_file(path, remove=True)
Unzips a .gz file.
- Parameters:
path – The path to the .gz file.
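A gzip round trip with the standard library gives a rough picture of what zipping and unzipping a single file involves. Several details here are assumptions rather than documented behavior: that zip_file produces gzip output, that the compressed file is named by appending .gz, and that remove=True deletes the .gz file after decompression. The helper names `gzip_file` and `gunzip_file` are hypothetical.

```python
import gzip
import os
import shutil
import tempfile


def gzip_file(path):
    """Compress path to path + '.gz' and return the new file name."""
    gz_path = path + ".gz"
    with open(path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return gz_path


def gunzip_file(path, remove=True):
    """Decompress a .gz file next to itself; optionally delete the archive."""
    if not path.endswith(".gz"):
        raise ValueError(f"Expected a .gz file: {path}")
    out_path = path[: -len(".gz")]
    with gzip.open(path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    if remove:
        os.remove(path)  # assumed meaning of remove=True
    return out_path


with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "proteins.json")
    with open(original, "w") as f:
        f.write('{"sequence": "MKT"}')
    gz_path = gzip_file(original)
    os.remove(original)  # keep only the compressed copy
    restored = gunzip_file(gz_path, remove=True)
    with open(restored) as f:
        restored_text = f.read()
    archive_still_exists = os.path.exists(gz_path)
```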