Wednesday, March 20, 2024

AI: LLMs, prompting, platforms (hardware) and projects


LLMs for Ingesting Web Pages


Web element detection

OpenDILabCommunity/webpage_element_detection · Hugging Face


Layout parser - doc ingester

https://layout-parser.github.io/





Multimodal LLMs
HPT 1.5 Air: A New Open-Sourced 8B Multimodal LLM with Llama 3

CSS and HTML LLM


Tiny Multi-modal LLM - regions + OCR



Function Calling Optimized LLMs


Hermes-2 Mistral 7B - good for "Function Calling" (the LLM emits a structured call and delegates the actual work, such as creating/modifying an app, to the client)
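A minimal sketch of the client side of function calling, assuming the model returns a JSON object with "name" and "arguments" fields. The tool names (create_app, rename_app) and the JSON shape are hypothetical and chosen only for illustration; the exact output format of Hermes-2 may differ.

```python
import json

# Hypothetical client-side tools; names and signatures are illustrative only.
def create_app(name: str) -> str:
    return f"created app '{name}'"

def rename_app(old: str, new: str) -> str:
    return f"renamed app '{old}' to '{new}'"

# Registry of functions the client is willing to run on the model's behalf.
TOOLS = {"create_app": create_app, "rename_app": rename_app}

def dispatch(model_output: str) -> str:
    """Parse a JSON tool call emitted by the model and run it on the client."""
    call = json.loads(model_output)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**call.get("arguments", {}))

# Stand-in for what a function-calling model might emit (assumed format).
example_output = '{"name": "create_app", "arguments": {"name": "todo"}}'
print(dispatch(example_output))
```

The model never touches the app itself: it only names a function and its arguments, and the client decides whether and how to run it.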


MoE (Mixture of Experts) LLM

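- faster inference than a dense model of the same total size, since only a few experts run per token
- can be more difficult to train (e.g. keeping the load balanced across experts)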
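A rough sketch of why MoE inference is cheaper, assuming top-k routing: a router scores every expert, but only the k best-scoring experts are actually evaluated. The experts here are toy functions and the router logits are passed in by hand; in a real model both are learned layers, so treat all names below as illustrative.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, router_logits, experts, k=2):
    """Run only the top-k experts and mix their outputs by renormalised gate weights."""
    gates = softmax(router_logits)
    # Indices of the k experts with the highest gate weight.
    top = sorted(range(len(experts)), key=lambda i: gates[i], reverse=True)[:k]
    norm = sum(gates[i] for i in top)
    output = sum(gates[i] / norm * experts[i](x) for i in top)
    return output, top

# Four toy "experts"; only two of them are evaluated for this input.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x * x]
y, used = moe_forward(3.0, [0.1, 2.0, -1.0, 2.0], experts, k=2)
print(y, used)
```

With 4 experts and k=2, half of the expert compute is skipped per input, which is where the inference saving comes from.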

DBRX - a 'mixture of experts' LLM

MoE in-depth article


Building and Fine-tuning Models


Deploying and Fine-tuning Deepseek

Fine-Tuning - NEL (Never Ending Learning)

Building a NN from scratch


Data and Data Science


Python libraries for data science and data trends (via ML)


Datasets - Roboflow (like HuggingFace for data)



Self-hosting LLMs

AWS Inferentia2 (inf2 instances - cheaper than g5 GPU instances)

Faster inference via the TGI (Text Generation Inference) framework

Faster inference via the vLLM framework (reportedly even faster than TGI)
https://github.com/vllm-project/vllm


LLM GitHub accelerator projects 





Prompt engineering guides




AWS Bedrock Tips

- batch inference

AWS Inferentia - speculative decoding
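A toy sketch of the idea behind speculative decoding, in its greedy form: a small draft model proposes a few tokens, and the large target model only verifies them, keeping the matching prefix. The two "models" below are plain hypothetical functions over integer tokens; nothing here is specific to Inferentia or to any AWS API.

```python
def speculative_decode(target_next, draft_next, prompt, max_new, k=4):
    """Greedy speculative decoding; the result equals plain greedy decoding with target_next."""
    tokens = list(prompt)
    goal = len(prompt) + max_new
    while len(tokens) < goal:
        # 1. The cheap draft model proposes up to k tokens.
        proposal = []
        for _ in range(min(k, goal - len(tokens))):
            proposal.append(draft_next(tokens + proposal))
        # 2. The target model checks each proposed position
        #    (a real system does this in one batched forward pass).
        for i, tok in enumerate(proposal):
            expected = target_next(tokens + proposal[:i])
            if tok != expected:
                # Keep the accepted prefix plus the target's own token.
                proposal = proposal[:i] + [expected]
                break
        tokens.extend(proposal)
    return tokens

# Toy models: the target counts upward modulo 10; the draft agrees except after a 4.
def target_next(tokens):
    return (tokens[-1] + 1) % 10

def draft_next(tokens):
    return 0 if tokens[-1] == 4 else (tokens[-1] + 1) % 10

print(speculative_decode(target_next, draft_next, [0], max_new=8))
```

The speed-up comes from the target model accepting several draft tokens per verification step whenever the draft is right, which is most of the time for a good draft model.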

AWS Prompt Routing and prompt caching
- can it automatically pick an LLM per prompt?


Other LLMs


SOTA LLM by Richard



Transformer Models and Encoders

DistilBERT - can be fine-tuned to classify intent or detect a toxic user prompt
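A sketch of how such a classifier could sit in front of an LLM as a gate. The gate accepts any callable that returns a (label, score) pair; keyword_stub is only a placeholder so the example runs without downloading a model, and the "toxic" label and 0.8 threshold are assumptions. A fine-tuned DistilBERT classifier would replace the stub in practice.

```python
from typing import Callable, Tuple

# Any function mapping a prompt to (label, confidence score).
Classifier = Callable[[str], Tuple[str, float]]

def keyword_stub(prompt: str) -> Tuple[str, float]:
    """Placeholder standing in for a fine-tuned DistilBERT classifier."""
    blocked = ("idiot", "stupid")
    hit = any(word in prompt.lower() for word in blocked)
    return ("toxic", 0.99) if hit else ("ok", 0.95)

def guard(prompt: str, classify: Classifier, threshold: float = 0.8) -> bool:
    """Return True if the prompt may be forwarded to the LLM."""
    label, score = classify(prompt)
    return not (label == "toxic" and score >= threshold)

print(guard("How do I deploy a model?", keyword_stub))
print(guard("You stupid bot", keyword_stub))
```

Because the small classifier runs before the LLM, a rejected prompt never incurs the cost of a full LLM call.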



LLMs and Security

Strengthening Your LLM Application Security: Prompt hacking and testing

Monitoring LLMs and Costs

Token counts - for cost estimates
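A back-of-the-envelope sketch of turning token counts into a cost estimate. The prices are made-up placeholders, and the "about 4 characters per token" rule is only a rough heuristic for English text; a real estimate should use the provider's own tokenizer and price list.

```python
import math

def rough_token_count(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate (assumed heuristic, not a real tokenizer)."""
    return math.ceil(len(text) / chars_per_token)

def estimate_cost(input_tokens: int, output_tokens: int,
                  price_in_per_1k: float, price_out_per_1k: float) -> float:
    """Cost of one call, given per-1,000-token prices for input and output."""
    return (input_tokens / 1000 * price_in_per_1k
            + output_tokens / 1000 * price_out_per_1k)

# Hypothetical prices in dollars per 1,000 tokens.
PRICE_IN, PRICE_OUT = 0.5, 1.5

prompt = "Summarise the following page. " * 100
n_in = rough_token_count(prompt)
print(n_in, estimate_cost(n_in, 500, PRICE_IN, PRICE_OUT))
```

Input and output tokens are priced separately here because many providers charge more for generated tokens than for prompt tokens.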
