About ailia SDK

The collection of pre-trained, state-of-the-art AI models.

ailia SDK is a self-contained, cross-platform, high-speed inference SDK for AI. It provides a consistent C++ API across Windows, Mac, Linux, iOS, Android, Jetson, and Raspberry Pi platforms. It also supports Unity (C#), Python, Rust, Flutter (Dart), and JNI for efficient AI implementation. The ailia SDK makes extensive use of the GPU through Vulkan and Metal to enable accelerated computing.

How to use

Try now on Google Colaboratory

If you would like to try on your computer:

ailia MODELS tutorial

ailia MODELS tutorial 日本語版

Supported models

358 models as of October 9th, 2024

Latest update

2024.10.09 Add whisper-v3-turbo
2024.10.02 Add florence2
2024.09.15 Add bert-vits2, pytorch_wavenet
2024.09.12 Add gpt-sovits-v2
2024.09.10 Add segment-anything-2 (video mode)
2024.08.27 Add segment-anything-2 (image mode)
2024.08.20 Add bert_ner_japanese
2024.08.16 Add latent-consistency-model-txt2img, fbcnn
2024.08.15 Add volo, elegant, depth_anything, drbn_skf, codeformer, dtln
2024.08.10 Add TripoSR, japanese-reranker-cross-encoder
2024.08.09 Add mahalanobis-ad, t5_base_japanese_ner
2024.08.08 Add sdxl-turbo, sd-turbo
2024.08.05 Migrate to ailia Tokenizer 1.3 from Transformers
More information in our Wiki

Action recognition

Model	Reference	Exported From	Supported Ailia Version	Blog
mars	MARS: Motion-Augmented RGB Stream for Action Recognition	Pytorch	1.2.4 and later	EN JP
st-gcn	ST-GCN	Pytorch	1.2.5 and later	EN JP
ax_action_recognition	Realtime-Action-Recognition	Pytorch	1.2.7 and later
va-cnn	View Adaptive Neural Networks (VA) for Skeleton-based Human Action Recognition	Pytorch	1.2.7 and later
driver-action-recognition-adas	driver-action-recognition-adas-0002	OpenVINO	1.2.5 and later
action_clip	ActionCLIP	Pytorch	1.2.7 and later
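
The "Supported Ailia Version" column in the tables of this document gives a minimum SDK version in the form "X.Y.Z and later". The following is a minimal sketch of how such an entry could be checked against an installed version. The helper names `parse_version` and `is_supported` are hypothetical and are not part of the ailia SDK; how the installed version string is obtained is left to the reader.

```python
import re


def parse_version(text):
    """Extract the dotted version from a string such as '1.2.14.1 and later'."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_supported(installed, requirement):
    """Return True if `installed` meets an 'X.Y.Z and later' requirement.

    Both arguments are plain strings; this helper is illustrative only
    and does not query the ailia SDK.
    """
    have = parse_version(installed)
    need = parse_version(requirement)
    # Pad the shorter tuple with zeros so that 1.3 compares equal to 1.3.0.
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    # Tuples compare component by component, so 1.2.10 ranks above 1.2.9.
    return have >= need


if __name__ == "__main__":
    print(is_supported("1.2.16", "1.2.14.1 and later"))  # True
    print(is_supported("1.2.9", "1.2.10 and later"))     # False
```

Comparing integer tuples rather than raw strings matters here because entries such as 1.2.9 and 1.2.10 would sort incorrectly as text.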

Anomaly detection

Model	Reference	Exported From	Supported Ailia Version	Date	Blog
mahalanobisad	MahalanobisAD-pytorch	Pytorch	1.2.9 and later	May 2020
spade-pytorch	Sub-Image Anomaly Detection with Deep Pyramid Correspondences	Pytorch	1.2.6 and later	May 2020
padim	PaDiM-Anomaly-Detection-Localization-master	Pytorch	1.2.6 and later	Nov 2020	EN JP
patchcore	PatchCore_anomaly_detection	Pytorch	1.2.6 and later	Jun 2021

Audio processing

Audio classification

Model	Reference	Exported From	Supported Ailia Version	Blog
crnn_audio_classification	crnn-audio-classification	Pytorch	1.2.5 and later	EN JP
transformer-cnn-emotion-recognition	Combining Spatial and Temporal Feature Representions of Speech Emotion by Parallelizing CNNs and Transformer-Encoders	Pytorch	1.2.5 and later
audioset_tagging_cnn	PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition	Pytorch	1.2.9 and later
clap	CLAP	Pytorch	1.2.6 and later
microsoft clap	CLAP	Pytorch	1.2.11 and later

Music enhancement

Model	Reference	Exported From	Supported Ailia Version	Blog
hifigan	HiFi-GAN	Pytorch	1.2.9 and later
deep music enhancer	On Filter Generalization for Music Bandwidth Extension Using Deep Neural Networks	Pytorch	1.2.6 and later

Music generation

Model	Reference	Exported From	Supported Ailia Version	Blog
pytorch_wavenet	pytorch_wavenet	Pytorch	1.2.14 and later

Noise reduction

Model	Reference	Exported From	Supported Ailia Version	Blog
unet_source_separation	source_separation	Pytorch	1.2.6 and later	EN JP
voicefilter	VoiceFilter	Pytorch	1.2.7 and later	EN JP
rnnoise	rnnoise	Keras	1.2.15 and later
dtln	Dual-signal Transformation LSTM Network	TensorFlow	1.3.0 and later

Phoneme alignment

Model	Reference	Exported From	Supported Ailia Version	Blog
narabas	narabas: Japanese phoneme forced alignment tool	Pytorch	1.2.11 and later

Pitch detection

Model	Reference	Exported From	Supported Ailia Version	Blog
crepe	torchcrepe	Pytorch	1.2.10 and later	JP

Speaker diarization

Model	Reference	Exported From	Supported Ailia Version	Blog
auto_speech	AutoSpeech: Neural Architecture Search for Speaker Recognition	Pytorch	1.2.5 and later	EN JP
wespeaker	WeSpeaker	ONNX Runtime	1.2.9 and later
pyannote-audio	Pyannote-audio	Pytorch	1.2.15 and later	JP

Speech to text

Model	Reference	Exported From	Supported Ailia Version	Date	Blog
deepspeech2	deepspeech.pytorch	Pytorch	1.2.2 and later	Oct 2017	EN JP
whisper	Whisper	Pytorch	1.2.10 and later	Dec 2022	JP
reazon_speech	ReazonSpeech	Pytorch	1.4.0 and later	Jan 2023
distil-whisper	Hugging Face - Distil-Whisper	Pytorch	1.2.16 and later	Nov 2023
reazon_speech2	ReazonSpeech2	Pytorch	1.4.0 and later	Feb 2024
kotoba-whisper	kotoba-whisper	Pytorch	1.2.16 and later	Apr 2024

Text to speech

Model	Reference	Exported From	Supported Ailia Version	Date	Blog
pytorch-dc-tts	Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention	Pytorch	1.2.6 and later	Oct 2017	EN JP
tacotron2	Tacotron2	Pytorch	1.2.15 and later	Feb 2018	JP
vall-e-x	VALL-E-X	Pytorch	1.2.15 and later	Mar 2023	JP
Bert-VITS2	Bert-VITS2	Pytorch	1.2.16 and later	Aug 2023
gpt-sovits	GPT-SoVITS	Pytorch	1.4.0 and later	Feb 2024	JP
gpt-sovits-v2	GPT-SoVITS	Pytorch	1.4.0 and later	Aug 2024

Voice activity detection

Model	Reference	Exported From	Supported Ailia Version	Blog
silero-vad	Silero VAD	Pytorch	1.2.15 and later	JP

Voice conversion

Model	Reference	Exported From	Supported Ailia Version	Blog
rvc	Retrieval-based-Voice-Conversion-WebUI	Pytorch	1.2.12 and later	JP

Background removal

Model	Reference	Exported From	Supported Ailia Version	Blog
U-2-Net	U^2-Net: Going Deeper with Nested U-Structure for Salient Object Detection	Pytorch	1.2.2 and later	EN JP
u2net-portrait-matting	U^2-Net - Portrait matting	Pytorch	1.2.7 and later
u2net-human-seg	U^2-Net - human segmentation	Pytorch	1.2.4 and later
deep-image-matting	Deep Image Matting	Keras	1.2.3 and later	EN JP
indexnet	Indices Matter: Learning to Index for Deep Image Matting	Pytorch	1.2.7 and later
modnet	MODNet: Trimap-Free Portrait Matting in Real Time	Pytorch	1.2.7 and later
background_matting_v2	Real-Time High-Resolution Background Matting	Pytorch	1.2.9 and later
cascade_psp	CascadePSP	Pytorch	1.2.9 and later
rembg	Rembg	Pytorch	1.2.4 and later
dis_seg	Highly Accurate Dichotomous Image Segmentation	Pytorch	1.2.10 and later
gfm	Bridging Composite and Real: Towards End-to-end Deep Image Matting	Pytorch	1.2.10 and later

Crowd counting

Model	Reference	Exported From	Supported Ailia Version	Blog
crowdcount-cascaded-mtl	CNN-based Cascaded Multi-task Learning of High-level Prior and Density Estimation for Crowd Counting (Single Image Crowd Counting)	Pytorch	1.2.1 and later	EN JP
c-3-framework	Crowd Counting Code Framework(C^3-Framework)	Pytorch	1.2.5 and later

Deep fashion

Model	Reference	Exported From	Supported Ailia Version	Blog
clothing-detection	Clothing-Detection	Pytorch	1.2.1 and later	EN JP
mmfashion	MMFashion	Pytorch	1.2.5 and later	EN JP
mmfashion_tryon	MMFashion virtual try-on	Pytorch	1.2.8 and later
mmfashion_retrieval	MMFashion In-Shop Clothes Retrieval	Pytorch	1.2.5 and later
fashionai-key-points-detection	A Pytorch Implementation of Cascaded Pyramid Network for FashionAI Key Points Detection	Pytorch	1.2.5 and later
person-attributes-recognition-crossroad	person-attributes-recognition-crossroad-0230	Pytorch	1.2.10 and later

Depth estimation

Model	Reference	Exported From	Supported Ailia Version	Blog
monodepth2	Monocular depth estimation from a single image	Pytorch	1.2.2 and later
midas	Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer	Pytorch	1.2.4 and later	EN JP
fcrn-depthprediction	Deeper Depth Prediction with Fully Convolutional Residual Networks	TensorFlow	1.2.6 and later
fast-depth	ICRA 2019 "FastDepth: Fast Monocular Depth Estimation on Embedded Systems"	Pytorch	1.2.5 and later
lap-depth	LapDepth-release	Pytorch	1.2.9 and later
hitnet	ONNX-HITNET-Stereo-Depth-estimation	Pytorch	1.2.9 and later
crestereo	ONNX-CREStereo-Depth-Estimation	Pytorch	1.2.13 and later
mobilestereonet	MobileStereoNet	Pytorch	1.2.13 and later
zoe_depth	ZoeDepth	Pytorch	1.3.0 and later
DepthAnything	DepthAnything	Pytorch	1.2.9 and later

Diffusion

Text to image

Model	Reference	Exported From	Supported Ailia Version	Blog
latent-diffusion-txt2img	Latent Diffusion - txt2img	Pytorch	1.2.10 and later
stable-diffusion-txt2img	Stable Diffusion	Pytorch	1.2.14 and later	JP
control_net	ControlNet	Pytorch	1.2.15 and later
sd-turbo	Hugging Face - SD-Turbo	Pytorch	1.2.16 and later
sdxl-turbo	Hugging Face - SDXL-Turbo	Pytorch	1.2.16 and later
latent-consistency-models	latent-consistency-models	Pytorch	1.2.16 and later

Text to audio

Model	Reference	Exported From	Supported Ailia Version	Blog
riffusion	Riffusion	Pytorch	1.2.16 and later

Others

Model	Reference	Exported From	Supported Ailia Version
latent-diffusion-inpainting	Latent Diffusion - inpainting	Pytorch	1.2.10 and later
latent-diffusion-superresolution	Latent Diffusion - Super-resolution	Pytorch	1.2.10 and later
DA-CLIP	DA-CLIP	Pytorch	1.2.16 and later
marigold	Marigold: Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation	Pytorch	1.2.16 and later

Face detection

Model	Reference	Exported From	Supported Ailia Version	Blog
yolov1-face	YOLO-Face-detection	Darknet	1.1.0 and later
yolov3-face	Face detection using keras-yolov3	Keras	1.2.1 and later
blazeface	BlazeFace-PyTorch	Pytorch	1.2.1 and later	EN JP
face-mask-detection	Face detection using keras-yolov3	Keras	1.2.1 and later	EN JP
dbface	DBFace : real-time, single-stage detector for face detection, with faster speed and higher accuracy	Pytorch	1.2.2 and later
retinaface	RetinaFace: Single-stage Dense Face Localisation in the Wild.	Pytorch	1.2.5 and later	JP
anime-face-detector	Anime Face Detector	Pytorch	1.2.6 and later
face-detection-adas	face-detection-adas-0001	OpenVINO	1.2.5 and later
mtcnn	mtcnn	Keras	1.2.10 and later

Face identification

Model	Reference	Exported From	Supported Ailia Version	Blog
vggface2	VGGFace2 Dataset for Face Recognition	Caffe	1.1.0 and later
arcface	pytorch implement of arcface	Pytorch	1.2.1 and later	EN JP
insightface	InsightFace: 2D and 3D Face Analysis Project	Pytorch	1.2.5 and later
cosface	Pytorch implementation of CosFace	Pytorch	1.2.10 and later
facenet_pytorch	Face Recognition Using Pytorch	Pytorch	1.2.6 and later

Face recognition

Age gender estimation

Model	Reference	Exported From	Supported Ailia Version	Blog
face_classification	Real-time face detection and emotion/gender classification	Keras	1.1.0 and later
age-gender-recognition-retail	age-gender-recognition-retail-0013	OpenVINO	1.2.5 and later	EN JP
mivolo	MiVOLO: Multi-input Transformer for Age and Gender Estimation	Pytorch	1.2.13 and later

Emotion recognition

Model	Reference	Exported From	Supported Ailia Version	Blog
ferplus	FER+	CNTK	1.2.2 and later
hsemotion	HSEmotion (High-Speed face Emotion recognition) library	Pytorch	1.2.5 and later

Gaze estimation

Model	Reference	Exported From	Supported Ailia Version	Blog
gazeml	A deep learning framework based on Tensorflow for the training of high performance gaze estimation	TensorFlow	1.2.0 and later
mediapipe_iris	irislandmarks.pytorch	Pytorch	1.2.2 and later	EN JP
ax_gaze_estimation	ax Gaze Estimation	Pytorch	1.2.2 and later	EN JP

Head pose estimation

Model	Reference	Exported From	Supported Ailia Version	Blog
hopenet	deep-head-pose	Pytorch	1.2.2 and later	EN JP
6d_repnet	6D Rotation Representation for Unconstrained Head Pose Estimation (Pytorch)	Pytorch	1.2.6 and later
L2CS_Net	L2CS_Net	Pytorch	1.2.9 and later

Keypoint detection

Model	Reference	Exported From	Supported Ailia Version	Blog
facial_feature	kaggle-facial-keypoints	Pytorch	1.2.0 and later
face_alignment	2D and 3D Face alignment library build using pytorch	Pytorch	1.2.1 and later	EN JP
prnet	Joint 3D Face Reconstruction and Dense Alignment with Position Map Regression Network	TensorFlow	1.2.2 and later
facemesh	facemesh.pytorch	Pytorch	1.2.2 and later	EN JP
facemesh_v2	MediaPipe Face landmark detection	Pytorch	1.2.9 and later	JP
3ddfa	Towards Fast, Accurate and Stable 3D Dense Face Alignment	Pytorch	1.2.10 and later

Others

Model	Reference	Exported From	Supported Ailia Version	Blog
ax_facial_features	ax Facial Features	Pytorch	1.2.5 and later	EN
face-anti-spoofing	Lightweight Face Anti Spoofing	Pytorch	1.2.5 and later	EN JP

Face restoration

Model	Reference	Exported From	Supported Ailia Version	Blog
gfpgan	GFP-GAN: Towards Real-World Blind Face Restoration with Generative Facial Prior	Pytorch	1.2.10 and later	JP
codeformer	CodeFormer: Towards Robust Blind Face Restoration with Codebook Lookup Transformer	Pytorch	1.2.9 and later

Face swapping

Model	Reference	Exported From	Supported Ailia Version	Blog
sber-swap	SberSwap	Pytorch	1.2.12 and later	JP
facefusion	FaceFusion	ONNX Runtime	1.2.10 and later

Frame Interpolation

Model	Reference	Exported From	Supported Ailia Version	Blog
flavr	FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation	Pytorch	1.2.7 and later	EN JP
cain	Channel Attention Is All You Need for Video Frame Interpolation	Pytorch	1.2.5 and later
film	FILM: Frame Interpolation for Large Motion	TensorFlow	1.2.10 and later
rife	Real-Time Intermediate Flow Estimation for Video Frame Interpolation	Pytorch	1.2.13 and later

Generative adversarial networks

Model	Reference	Exported From	Supported Ailia Version	Blog
pytorch-gan	Code repo for the Pytorch GAN Zoo project (used to train this model)	Pytorch	1.2.4 and later
council-gan	Council-GAN	Pytorch	1.2.4 and later
restyle-encoder	ReStyle	Pytorch	1.2.9 and later
sam	Age Transformation Using a Style-Based Regression Model	Pytorch	1.2.9 and later
encoder4editing	Designing an Encoder for StyleGAN Image Manipulation	Pytorch	1.2.10 and later
lipgan	LipGAN	Keras	1.2.15 and later	JP

Hand detection

Model	Reference	Exported From	Supported Ailia Version
yolov3-hand	Hand detection branch of Face detection using keras-yolov3	Keras	1.2.1 and later
hand_detection_pytorch	hand-detection.PyTorch	Pytorch	1.2.2 and later
blazepalm	MediaPipePyTorch	Pytorch	1.2.5 and later

Hand recognition

Model	Reference	Exported From	Supported Ailia Version	Blog
blazehand	MediaPipePyTorch	Pytorch	1.2.5 and later	EN JP
hand3d	ColorHandPose3D network	TensorFlow	1.2.5 and later
minimal-hand	Minimal Hand	TensorFlow	1.2.8 and later
v2v-posenet	V2V-PoseNet	Pytorch	1.2.6 and later
hands_segmentation_pytorch	hands-segmentation-pytorch	Pytorch	1.2.10 and later

Image captioning

Model	Reference	Exported From	Supported Ailia Version	Blog
illustration2vec	Illustration2Vec	Caffe	1.2.2 and later
image_captioning_pytorch	Image Captioning pytorch	Pytorch	1.2.5 and later	EN JP
blip2	Hugging Face - BLIP-2	Pytorch	1.2.16 and later

Image classification

CNN

Model	Reference	Exported From	Supported Ailia Version	Blog
alexnet	AlexNet PyTorch	Pytorch	1.2.5 and later
vgg16	Very Deep Convolutional Networks for Large-Scale Image Recognition	Keras	1.1.0 and later
googlenet	Going Deeper with Convolutions	Pytorch	1.2.0 and later
resnet18	ResNet18	Pytorch	1.2.8 and later
resnet50	Deep Residual Learning for Image Recognition	Chainer	1.2.0 and later
wide_resnet50	Wide Resnet	Pytorch	1.2.5 and later
inceptionv3	Rethinking the Inception Architecture for Computer Vision	Pytorch	1.2.0 and later	JP
inceptionv4	Keras Inception-V4	Keras	1.2.5 and later
mobilenetv2	PyTorch Implemention of MobileNet V2	Pytorch	1.2.0 and later
mobilenetv3	PyTorch Implemention of MobileNet V3	Pytorch	1.2.1 and later
efficientnet	A PyTorch implementation of EfficientNet	Pytorch	1.2.3 and later
efficientnetv2	EfficientNetV2	Pytorch	1.2.4 and later
mlp_mixer	MLP-Mixer	Pytorch	1.2.9 and later
convnext	A PyTorch implementation of ConvNeXt	Pytorch	1.2.5 and later
mobileone	A PyTorch implementation of MobileOne	Pytorch	1.2.1 and later
imagenet21k	ImageNet21K	Pytorch	1.2.11 and later
volo	VOLO: Vision Outlooker for Visual Recognition	Pytorch	1.2.9 and later

Transformer

Model	Reference	Exported From	Supported Ailia Version	Blog
vit	Pytorch reimplementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale)	Pytorch	1.2.7 and later	EN JP
swin-transformer	Swin Transformer	Pytorch	1.2.6 and later
clip	CLIP	Pytorch	1.2.9 and later	EN JP
japanese-clip	Japanese-CLIP	Pytorch	1.2.15 and later
japanese-stable-clip-vit-l-16	japanese-stable-clip-vit-l-16	Pytorch	1.2.11 and later

Specific task

Model	Reference	Exported From	Supported Ailia Version	Blog
partialconv	Partial Convolution Layer for Padding and Image Inpainting	Pytorch	1.2.0 and later
weather-prediction-from-image	Weather Prediction From Image - (Warmth Of Image)	Keras	1.2.5 and later

Image inpainting

Model	Reference	Exported From	Supported Ailia Version	Blog
inpainting-with-partial-conv	pytorch-inpainting-with-partial-conv	PyTorch	1.2.6 and later	EN JP
inpainting_gmcnn	Image Inpainting via Generative Multi-column Convolutional Neural Networks	TensorFlow	1.2.6 and later
3d-photo-inpainting	3D Photography using Context-aware Layered Depth Inpainting	Pytorch	1.2.7 and later
deepfillv2	Free-Form Image Inpainting with Gated Convolution	Pytorch	1.2.9 and later

Image manipulation

Model	Reference	Exported From	Supported Ailia Version	Blog
noise2noise	Learning Image Restoration without Clean Data	Pytorch	1.2.0 and later
dewarpnet	DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks	Pytorch	1.2.1 and later
illnet	Document Rectification and Illumination Correction using a Patch-based CNN	Pytorch	1.2.2 and later
colorization	Colorful Image Colorization	Pytorch	1.2.2 and later	EN JP
u2net_portrait	U^2-Net: Going Deeper with Nested U-Structure for Salient Object Detection	Pytorch	1.2.2 and later
style2paints	Style2Paints	TensorFlow	1.2.6 and later
deep_white_balance	Deep White-Balance Editing, CVPR 2020 (Oral)	PyTorch	1.2.6 and later
deblur_gan	DeblurGAN	Pytorch	1.2.6 and later
invertible_denoising_network	Invertible Image Denoising	Pytorch	1.2.8 and later
dfm	Deep Feature Matching	Pytorch	1.2.6 and later
dfe	Deep Fundamental Matrix Estimation	Pytorch	1.2.6 and later
dehamer	Image Dehazing Transformer with Transmission-Aware 3D Position Embedding	Pytorch	1.2.13 and later
pytorch-superpoint	pytorch-superpoint : Self-Supervised Interest Point Detection and Description	Pytorch	1.2.6 and later
cnngeometric_pytorch	CNNGeometric PyTorch implementation	Pytorch	1.2.7 and later
lightglue	LightGlue-ONNX	Pytorch	1.2.15 and later
docshadow	DocShadow-ONNX-TensorRT	Pytorch	1.2.10 and later
fbcnn	Towards Flexible Blind JPEG Artifacts Removal	Pytorch	1.2.9 and later

Image restoration

Model	Reference	Exported From	Supported Ailia Version	Blog
nafnet	NAFNet: Nonlinear Activation Free Network for Image Restoration	Pytorch	1.2.10 and later

Image segmentation

Model	Reference	Exported From	Supported Ailia Version	Blog
deeplabv3	Xception65 for backbone network of DeepLab v3+	Chainer	1.2.0 and later
hrnet_segmentation	High-resolution networks (HRNets) for Semantic Segmentation	Pytorch	1.2.1 and later
hair_segmentation	hair segmentation in mobile device	Keras	1.2.1 and later
pspnet-hair-segmentation	pytorch-hair-segmentation	Pytorch	1.2.2 and later
human_part_segmentation	Self Correction for Human Parsing	Pytorch	1.2.4 and later	EN JP
semantic-segmentation-mobilenet-v3	Semantic segmentation with MobileNetV3	TensorFlow	1.2.5 and later
pytorch-unet	Pytorch-Unet	Pytorch	1.2.5 and later
pytorch-enet	PyTorch-ENet	Pytorch	1.2.8 and later
yet-another-anime-segmenter	Yet Another Anime Segmenter	Pytorch	1.2.6 and later
swiftnet	SwiftNet	Pytorch	1.2.6 and later
dense_prediction_transformers	Vision Transformers for Dense Prediction	Pytorch	1.2.7 and later	EN JP
paddleseg	PaddleSeg	Pytorch	1.2.7 and later	EN JP
pp_liteseg	PP-LiteSeg	Pytorch	1.2.10 and later
suim	SUIM	Keras	1.2.6 and later
group_vit	GroupViT	Pytorch	1.2.10 and later
anime-segmentation	Anime Segmentation	Pytorch	1.2.9 and later
segment-anything	Segment Anything	Pytorch	1.2.16 and later
tusimple-DUC	TuSimple-DUC	Pytorch	1.2.10 and later
pytorch-fcn	pytorch-fcn	Pytorch	1.3.0 and later
grounded_sam	Grounded-SAM	Pytorch	1.2.16 and later
segment-anything-2	Segment Anything 2	Pytorch	1.2.16 and later

Large Language Model

Model	Reference	Exported From	Supported Ailia Version	Blog
llava	LLaVA	Pytorch	1.2.16 and later

Landmark classification

Model	Reference	Exported From	Supported Ailia Version	Blog
landmarks_classifier_asia	Landmarks classifier_asia_V1.1	TensorFlow Hub	1.2.4 and later	EN JP
places365	Release of Places365-CNNs	Pytorch	1.2.5 and later

Line segment detection

Model	Reference	Exported From	Supported Ailia Version	Blog
mlsd	M-LSD: Towards Light-weight and Real-time Line Segment Detection	TensorFlow	1.2.8 and later	EN JP
dexined	DexiNed: Dense Extreme Inception Network for Edge Detection	Pytorch	1.2.5 and later

Low Light Image Enhancement

Model	Reference	Exported From	Supported Ailia Version	Blog
agllnet	AGLLNet: Attention Guided Low-light Image Enhancement (IJCV 2021)	Pytorch	1.2.9 and later	EN JP
drbn_skf	DRBN SKF	Pytorch	1.2.14 and later

Natural language processing

Bert

Model	Reference	Exported From	Supported Ailia Version	Blog
bert	pytorch-pretrained-bert	Pytorch	1.2.2 and later	EN JP
bert_maskedlm	huggingface/transformers	Pytorch	1.2.5 and later
bert_question_answering	huggingface/transformers	Pytorch	1.2.5 and later
bert_zero_shot_classification	huggingface/transformers	Pytorch	1.2.5 and later

Embedding

Model	Reference	Exported From	Supported Ailia Version	Blog
sentence_transformers_japanese	sentence transformers	Pytorch	1.2.7 and later	JP
multilingual-e5	multilingual-e5-base	Pytorch	1.2.15 and later	JP
glucose	GLuCoSE (General Luke-based Contrastive Sentence Embedding)-base-Japanese	Pytorch	1.2.15 and later

Error corrector

Model	Reference	Exported From	Supported Ailia Version
bert_insert_punctuation	bert-japanese	Pytorch	1.2.15 and later
t5_whisper_medical	error correction of medical terms using t5	Pytorch	1.2.13 and later
bertjsc	bertjsc	Pytorch	1.2.15 and later

Grapheme to phoneme

Model	Reference	Exported From	Supported Ailia Version	Blog
soundchoice-g2p	Hugging Face - speechbrain/soundchoice-g2p	Pytorch	1.2.16 and later
g2p_en	g2p_en	Pytorch	1.2.14 and later

Named entity recognition

Model	Reference	Exported From	Supported Ailia Version
bert_ner	huggingface/transformers	Pytorch	1.2.5 and later
t5_base_japanese_ner	t5-japanese	Pytorch	1.2.13 and later
bert_ner_japanese	jurabi/bert-ner-japanese	Pytorch	1.2.10 and later

Reranker

Model	Reference	Exported From	Supported Ailia Version	Blog
cross_encoder_mmarco	jeffwan/mmarco-mMiniLMv2-L12-H384-v	Pytorch	1.2.10 and later	JP
japanese-reranker-cross-encoder	hotchpotch/japanese-reranker-cross-encoder-large-v1	Pytorch	1.2.16 and later

Sentence generation

Model	Reference	Exported From	Supported Ailia Version	Blog
gpt2	GPT-2	Pytorch	1.2.7 and later
rinna_gpt2	japanese-pretrained-models	Pytorch	1.2.7 and later

Sentiment analysis

Model	Reference	Exported From	Supported Ailia Version	Blog
bert_sentiment_analysis	huggingface/transformers	Pytorch	1.2.5 and later
bert_tweets_sentiment	huggingface/transformers	Pytorch	1.2.5 and later

Summarize

Model	Reference	Exported From	Supported Ailia Version	Blog
bert_sum_ext	BERTSUMEXT	Pytorch	1.2.7 and later
presumm	PreSumm	Pytorch	1.2.8 and later
t5_base_japanese_title_generation	t5-japanese	Pytorch	1.2.13 and later	JP
t5_base_summarization	t5-japanese	Pytorch	1.2.13 and later

Translation

Model	Reference	Exported From	Supported Ailia Version	Blog
fugumt-en-ja	Fugu-Machine Translator	Pytorch	1.2.9 and later	JP
fugumt-ja-en	Fugu-Machine Translator	Pytorch	1.2.10 and later

Network intrusion detection

Model	Reference	Exported From	Supported Ailia Version	Blog
bert-network-packet-flow-header-payload	bert-network-packet-flow-header-payload	Pytorch	1.2.10 and later
falcon-adapter-network-packet	falcon-adapter-network-packet	Pytorch	1.2.10 and later

Neural Rendering

Model	Reference	Exported From	Supported Ailia Version	Blog
nerf	NeRF: Neural Radiance Fields	TensorFlow	1.2.10 and later	EN JP
TripoSR	TripoSR	Pytorch	1.2.6 and later

NSFW detector

Model	Reference	Exported From	Supported Ailia Version	Blog
clip-based-nsfw-detector	CLIP-based-NSFW-Detector	Keras	1.2.10 and later	JP

Object detection

CNN

Model	Reference	Exported From	Supported Ailia Version	Blog
yolov1-tiny	YOLO: Real-Time Object Detection	Darknet	1.1.0 and later	JP
yolov2	YOLO: Real-Time Object Detection	Pytorch	1.2.0 and later
yolov2-tiny	YOLO: Real-Time Object Detection	Pytorch	1.2.6 and later
yolov3	YOLO: Real-Time Object Detection	ONNX Runtime	1.2.1 and later	EN JP
yolov3-tiny	YOLO: Real-Time Object Detection	ONNX Runtime	1.2.1 and later
yolov4	Pytorch-YOLOv4	Pytorch	1.2.4 and later	EN JP
yolov4-tiny	Pytorch-YOLOv4	Pytorch	1.2.5 and later
yolov5	yolov5	Pytorch	1.2.5 and later	EN JP
yolov6	YOLOV6	Pytorch	1.2.10 and later
yolov7	YOLOv7	Pytorch	1.2.7 and later
yolov8	YOLOv8	Pytorch	1.2.14.1 and later
yolov8-seg	YOLOv8	Pytorch	1.2.14.1 and later
yolov9	YOLOv9	Pytorch	1.2.10 and later
yolor	yolor	Pytorch	1.2.5 and later
yolox	YOLOX	Pytorch	1.2.6 and later	EN JP
yolox-ti-lite	edgeai-yolox	Pytorch	1.2.9 and later
yolov	YOLOV	Pytorch	1.2.10 and later
mobilenet_ssd	MobileNetV1, MobileNetV2, VGG based SSD/SSD-lite implementation in Pytorch	Pytorch	1.2.1 and later	EN JP
maskrcnn	Mask R-CNN: real-time neural network for object instance segmentation	Pytorch	1.2.3 and later
m2det	M2Det: A Single-Shot Object Detector based on Multi-Level Feature Pyramid Network	Pytorch	1.2.3 and later	EN JP
centernet	CenterNet : Objects as Points	Pytorch	1.2.1 and later	EN JP
pedestrian_detection	Pedestrian-Detection-on-YOLOv3_Research-and-APP	Keras	1.2.1 and later
efficientdet	EfficientDet: Scalable and Efficient Object Detection, in PyTorch	Pytorch	1.2.6 and later
nanodet	NanoDet	Pytorch	1.2.6 and later
picodet	PP-PicoDet	Pytorch	1.2.10 and later
yolact	You Only Look At CoefficienTs	Pytorch	1.2.6 and later
fastest-det	FastestDet	Pytorch	1.2.5 and later
poly_yolo	Poly YOLO	Keras	1.2.6 and later
crowd_det	Detection in Crowded Scenes	Pytorch	1.2.13 and later
damo_yolo	DAMO-YOLO	Pytorch	1.2.9 and later

Transformer

Model	Reference	Exported From	Supported Ailia Version	Date	Blog
glip	GLIP	Pytorch	1.2.13 and later	Dec 2021
dab-detr	DAB-DETR	Pytorch	1.2.12 and later	Jan 2022
detic	Detecting Twenty-thousand Classes using Image-level Supervision	Pytorch	1.2.10 and later	Jan 2022	EN JP
groundingdino	Grounding DINO	Pytorch	1.2.16 and later	Mar 2023	JP

Specific target

Model	Reference	Exported From	Supported Ailia Version	Blog
mobile_object_localizer	mobile_object_localizer_v1	TensorFlow Hub	1.2.6 and later	EN JP
sku110k-densedet	SKU110K-DenseDet	Pytorch	1.2.9 and later	EN JP
traffic-sign-detection	Traffic Sign Detection	TensorFlow	1.2.10 and later	EN JP
footandball	FootAndBall: Integrated player and ball detector	Pytorch	1.2.0 and later
qrcode_wechatqrcode	qrcode_wechatqrcode	Caffe	1.2.15 and later
layout_parsing	unstructured-inference	Pytorch	1.2.9 and later

Object detection 3d

Model	Reference	Exported From	Supported Ailia Version	Blog
3d_bbox	3D Bounding Box Estimation Using Deep Learning and Geometry	Pytorch	1.2.6 and later
3d-object-detection.pytorch	3d-object-detection.pytorch	Pytorch	1.2.8 and later	EN JP
mediapipe_objectron	MediaPipe Objectron	TensorFlow Lite	1.2.5 and later
egonet	EgoNet	Pytorch	1.2.9 and later
d4lcn	D4LCN	Pytorch	1.2.9 and later
did_m3d	DID M3D	Pytorch	1.2.11 and later

Object tracking

Model	Reference	Exported From	Supported Ailia Version	Blog
deepsort	Deep Sort with PyTorch	Pytorch	1.2.3 and later	EN JP
person_reid_baseline_pytorch	UTS-Person-reID-Practical	Pytorch	1.2.6 and later
abd_net	Attentive but Diverse Person Re-Identification	Pytorch	1.2.7 and later
siam-mot	SiamMOT	Pytorch	1.2.9 and later
bytetrack	ByteTrack	Pytorch	1.2.5 and later	EN JP
qd-3dt	Monocular Quasi-Dense 3D Object Tracking	Pytorch	1.2.11 and later
strong_sort	StrongSORT	Pytorch	1.2.15 and later
centroids-reid	On the Unreasonable Effectiveness of Centroids in Image Retrieval	Pytorch	1.2.9 and later
deepsort_vehicle	Multi-Camera Live Object Tracking	Pytorch	1.2.9 and later

Optical Flow Estimation

Model	Reference	Exported From	Supported Ailia Version	Blog
raft	RAFT: Recurrent All Pairs Field Transforms for Optical Flow	Pytorch	1.2.6 and later	EN JP

Point segmentation

Model	Reference	Exported From	Supported Ailia Version	Blog
pointnet_pytorch	PointNet.pytorch	Pytorch	1.2.6 and later

Pose estimation

Model	Reference	Exported From	Supported Ailia Version	Blog
openpose	Code repo for realtime multi-person pose estimation in CVPR'17 (Oral)	Caffe	1.2.1 and later
lightweight-human-pose-estimation	Fast and accurate human pose estimation in PyTorch. Contains implementation of "Real-time 2D Multi-Person Pose Estimation on CPU: Lightweight OpenPose" paper.	Pytorch	1.2.1 and later	EN JP
pose_resnet	Simple Baselines for Human Pose Estimation and Tracking	Pytorch	1.2.1 and later	EN JP
blazepose	MediaPipePyTorch	Pytorch	1.2.5 and later
efficientpose	Code repo for EfficientPose	TensorFlow	1.2.6 and later
movenet	Code repo for movenet	TensorFlow	1.2.8 and later	EN JP
animalpose	MMPose - 2D animal pose estimation	Pytorch	1.2.7 and later	EN JP
mediapipe_holistic	MediaPipe Holistic	TensorFlow	1.2.9 and later
ap-10k	AP-10K	Pytorch	1.2.4 and later
posenet	PoseNet Pytorch	Pytorch	1.2.10 and later
e2pose	E2Pose	TensorFlow	1.2.5 and later

Pose estimation 3d

Model	Reference	Exported From	Supported Ailia Version	Blog
lightweight-human-pose-estimation-3d	Real-time 3D multi-person pose estimation demo in PyTorch. OpenVINO backend can be used for fast inference on CPU.	Pytorch	1.2.1 and later
3d-pose-baseline	A simple baseline for 3d human pose estimation in tensorflow. Presented at ICCV 17.	TensorFlow	1.2.3 and later
pose-hg-3d	Towards 3D Human Pose Estimation in the Wild: a Weakly-supervised Approach	Pytorch	1.2.6 and later
blazepose-fullbody	MediaPipe	TensorFlow Lite	1.2.5 and later	EN JP
3dmppe_posenet	PoseNet of "Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image"	Pytorch	1.2.6 and later
gast	A Graph Attention Spatio-temporal Convolutional Networks for 3D Human Pose Estimation in Video (GAST-Net)	Pytorch	1.2.7 and later	EN JP
mediapipe_pose_world_landmarks	MediaPipe Pose real-world 3D coordinates	TensorFlow Lite	1.2.10 and later

Road detection

Model	Reference	Exported From	Supported Ailia Version	Blog
codes-for-lane-detection	Codes-for-Lane-Detection	Pytorch	1.2.6 and later	EN JP
roneld	RONELD-Lane-Detection	Pytorch	1.2.6 and later
road-segmentation-adas	road-segmentation-adas-0001	OpenVINO	1.2.5 and later
cdnet	CDNet	Pytorch	1.2.5 and later
lstr	LSTR	Pytorch	1.2.8 and later
ultra-fast-lane-detection	Ultra-Fast-Lane-Detection	Pytorch	1.2.6 and later
yolop	YOLOP	Pytorch	1.2.6 and later
hybridnets	HybridNets	Pytorch	1.2.6 and later
polylanenet	PolyLaneNet	Pytorch	1.2.9 and later

Rotation prediction

Model	Reference	Exported From	Supported Ailia Version	Blog
rotnet	CNNs for predicting the rotation angle of an image to correct its orientation	Keras	1.2.1 and later

Style transfer

Model	Reference	Exported From	Supported Ailia Version	Blog
adain	Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization	Pytorch	1.2.1 and later	EN JP
psgan	PSGAN: Pose and Expression Robust Spatial-Aware GAN for Customizable Makeup Transfer	Pytorch	1.2.7 and later
beauty_gan	BeautyGAN	Pytorch	1.2.7 and later
animeganv2	PyTorch Implementation of AnimeGANv2	Pytorch	1.2.5 and later
pix2pixHD	pix2pixHD: High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs	Pytorch	1.2.6 and later
EleGANt	EleGANt: Exquisite and Locally Editable GAN for Makeup Transfer	Pytorch	1.2.15 and later

Super resolution

Model	Reference	Exported From	Supported Ailia Version	Blog
srresnet	Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network	Pytorch	1.2.0 and later	EN JP
edsr	Enhanced Deep Residual Networks for Single Image Super-Resolution	Pytorch	1.2.6 and later	EN JP
han	Single Image Super-Resolution via a Holistic Attention Network	Pytorch	1.2.6 and later
real-esrgan	Real-ESRGAN	Pytorch	1.2.9 and later
rcan-it	Revisiting RCAN: Improved Training for Image Super-Resolution	Pytorch	1.2.10 and later
swinir	SwinIR: Image Restoration Using Swin Transformer	Pytorch	1.2.12 and later
Hat	Hat	Pytorch	1.2.6 and later

Text detection

Model	Reference	Exported From	Supported Ailia Version
craft_pytorch	CRAFT: Character-Region Awareness For Text detection	Pytorch	1.2.2 and later
pixel_link	Pixel-Link	TensorFlow	1.2.6 and later
east	EAST: An Efficient and Accurate Scene Text Detector	TensorFlow	1.2.6 and later

Text recognition

Model	Reference	Exported From	Supported Ailia Version	Blog
etl	Japanese Character Classification	Keras	1.1.0 and later	JP
deep-text-recognition-benchmark	deep-text-recognition-benchmark	Pytorch	1.2.6 and later
crnn.pytorch	Convolutional Recurrent Neural Network	Pytorch	1.2.6 and later
paddleocr	PaddleOCR : Awesome multilingual OCR toolkits based on PaddlePaddle	Pytorch	1.2.6 and later	EN JP
easyocr	Ready-to-use OCR with 80+ supported languages	Pytorch	1.2.6 and later
ndlocr_text_recognition	NDL OCR	Pytorch	1.2.5 and later

Time-Series Forecasting

Model	Reference	Exported From	Supported Ailia Version	Blog
informer2020	Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting (AAAI'21 Best Paper)	Pytorch	1.2.10 and later

Vehicle recognition

Model	Reference	Exported From	Supported Ailia Version	Blog
vehicle-attributes-recognition-barrier	vehicle-attributes-recognition-barrier-0042	OpenVINO	1.2.5 and later	EN JP
vehicle-license-plate-detection-barrier	vehicle-license-plate-detection-barrier-0106	OpenVINO	1.2.5 and later

Vision Language Model

Model	Reference	Exported From	Supported Ailia Version	Blog
florence2	Hugging Face - microsoft/Florence-2-base	Pytorch	1.2.16 and later

Commercial model

Model	Reference	Exported From	Supported Ailia Version	Blog
acculus-pose	Acculus, Inc.	Caffe	1.2.3 and later
