Stuff I Want To Read: AI: LLMs, prompting, platforms (hardware) and projects

LLMs for Ingesting Web Pages

Web element detection

OpenDILabCommunity/webpage_element_detection · Hugging Face

Layout parser - doc ingester

https://layout-parser.github.io/

YOLO Object Detection

https://www.v7labs.com/blog/yolo-object-detection

multimodal LLMs

HuggingFaceM4/idefics2-8b-chatty · Hugging Face

HuggingFaceM4/idefics2-8b · Hugging Face

HPT 1.5 Air: A New Open-Sourced 8B Multimodal LLM with Llama 3

CSS and HTML LLM

Tesslate/UIGEN-T2-7B-Q8_0-GGUF · Hugging Face

Tiny Multi-modal LLM - regions + OCR

microsoft/Florence-2-large · Hugging Face

Function Calling Optimized LLMs

Hermes-2 Pro Mistral 7B - good for function calling (the LLM emits a structured call, and the client carries out the action, e.g. creating or modifying an app)

https://huggingface.co/NousResearch/Hermes-2-Pro-Mistral-7B
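A minimal sketch of the client side of function calling, to make the "LLM delegates to the client" idea concrete. It assumes the model wraps a JSON object with `name` and `arguments` fields in `<tool_call>` tags; check the model card for the exact format. The `create_app` tool, the `TOOLS` registry and the `dispatch` helper are hypothetical names made up for illustration.

```python
import json
import re

# Hypothetical client-side tool; a real one would create or modify an app.
def create_app(name: str, template: str = "blank") -> dict:
    return {"status": "created", "name": name, "template": template}

# Registry of functions the client is willing to run on the model's behalf.
TOOLS = {"create_app": create_app}

# Assumed wire format: a JSON object wrapped in <tool_call> tags.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)

def dispatch(model_output: str) -> list:
    """Find every tool call in the model output and run it on the client."""
    results = []
    for match in TOOL_CALL_RE.finditer(model_output):
        call = json.loads(match.group(1))
        func = TOOLS.get(call["name"])
        if func is None:
            # Never run anything that is not in the registry.
            results.append({"error": f"unknown tool: {call['name']}"})
            continue
        results.append(func(**call.get("arguments", {})))
    return results

# Example with a hand-written stand-in for real model output.
sample = '<tool_call>{"name": "create_app", "arguments": {"name": "todo"}}</tool_call>'
print(dispatch(sample))
```

The model never executes anything itself: it only names a function and its arguments, and the client decides whether to run it.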

MoE (Mixture of Experts) LLM

- faster inference

- can be more difficult to train
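Why inference can be faster: for each token a router scores the experts and only the top few are run. Below is a toy sketch of top-k routing in plain Python. The experts are scalar functions and the router scores are passed in by hand, both assumptions made for illustration; in a real MoE layer the experts are feed-forward networks and the router is a learned layer.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, router_logits, experts, top_k=2):
    """Run only the top_k experts for input x and mix their outputs."""
    # Indices of the k highest-scoring experts.
    chosen = sorted(range(len(experts)), key=lambda i: router_logits[i], reverse=True)[:top_k]
    # Renormalise the router scores over the chosen experts only.
    weights = softmax([router_logits[i] for i in chosen])
    # Only the chosen experts are evaluated; the others are skipped entirely.
    output = sum(w * experts[i](x) for w, i in zip(weights, chosen))
    return output, chosen

# Four toy experts; with top_k=2 only half of them run per input.
toy_experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x * x]
y, used = moe_forward(4.0, [0.0, 5.0, 5.0, -1.0], toy_experts, top_k=2)
print(y, used)
```

The model keeps the capacity of all experts in its weights but pays the compute cost of only `top_k` of them per token.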

DBRX - a 'mixture of experts' LLM

Introducing DBRX: A New State-of-the-Art Open LLM | Databricks

MoE in depth article

https://cameronrwolfe.substack.com/p/moe-llms

Building and Fine-tuning Models

Deploying and Fine-tuning Deepseek

https://huggingface.co/blog/deepseek-r1-aws

Fine-Tuning - NEL (Never Ending Learning)

The Power of Fine-Tuning on Your Data | Databricks Blog

Building a NN from scratch

MNIST Neural Network From Scratch (kaggle.com)

Data and Data Science

Data science and data trends (via ML) Python libraries

https://www.marktechpost.com/2024/05/15/10-python-packages-revolutionizing-data-science-workflow/

Data generation - Langchain

Synthetic data generation | 🦜️🔗 Langchain

Datasets - Roboflow (like HuggingFace for data)

https://public.roboflow.com/

Self-hosting LLMs

Accelerating Hugging Face Transformers with AWS Inferentia2

AWS Inferentia2 (inf2 AWS instances - cheaper than g5)

Faster Inference via TGI framework

https://huggingface.co/docs/text-generation-inference/en/index

Faster Inference via the vLLM framework (reportedly even faster than TGI)

https://github.com/vllm-project/vllm

LLM GitHub accelerator projects

https://github.blog/2024-05-23-2024-github-accelerator-meet-the-11-projects-shaping-open-source-ai/

Prompt engineering guides

- The Prompt Report - https://arxiv.org/pdf/2406.06608

- https://www.promptingguide.ai

- Anthropic - [Answer Key] Anthropic's Prompt Engineering Interactive Tutorial [PUBLIC ACCESS] - Google Spreadsheets

- Claude Prompt Engineering & Function Calls

- Claude - Tool Use

- Optimize a prompt - Amazon Bedrock

- Prompt Smells, Just Like Code

AWS Bedrock Tips

- batch inference

https://docs.aws.amazon.com/bedrock/latest/userguide/batch-inference.html

AWS Inferentia - speculative decoding

Faster LLMs with speculative decoding and AWS Inferentia2 | AWS Machine Learning Blog
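A toy sketch of the idea behind speculative decoding, assuming greedy decoding: a cheap draft model proposes several tokens, and the large target model checks them in one step and keeps the longest prefix it agrees with. The two "models" here are hypothetical stand-in functions over integer tokens, not real LLMs, and the sketch leaves out the sampling-based acceptance rule used by real implementations.

```python
def speculative_decode(target_next, draft_next, prompt, max_new_tokens, k=4):
    """Greedy speculative decoding; output equals plain greedy decoding with target_next."""
    tokens = list(prompt)
    target_calls = 0
    while len(tokens) - len(prompt) < max_new_tokens:
        # 1. The cheap draft model proposes k tokens, one after another.
        proposal = []
        for _ in range(k):
            proposal.append(draft_next(tokens + proposal))
        # 2. The target model checks every proposed position. A real system
        #    does this in one batched forward pass, counted here as one call.
        target_calls += 1
        accepted = []
        for i in range(k):
            expected = target_next(tokens + proposal[:i])
            if expected == proposal[i]:
                accepted.append(proposal[i])
            else:
                # 3. First mismatch: keep the target's token, drop the rest.
                accepted.append(expected)
                break
        tokens.extend(accepted)
    return tokens[: len(prompt) + max_new_tokens], target_calls

# Stand-in target model: counts upwards modulo 10.
def toy_target(seq):
    return (seq[-1] + 1) % 10

# Stand-in draft model: agrees with the target except right after token 4.
def toy_draft(seq):
    return 0 if seq[-1] == 4 else (seq[-1] + 1) % 10

out, calls = speculative_decode(toy_target, toy_draft, [0], max_new_tokens=8, k=4)
print(out, calls)
```

Each verification step yields at least one token, so the output is the same as decoding with the target alone; the gain comes from needing fewer target steps whenever the draft guesses well.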

AWS Prompt Routing and prompt caching

- can it auto-pick an LLM for each prompt?

Reduce costs and latency with Amazon Bedrock Intelligent Prompt Routing and prompt caching (preview) | AWS News Blog

Comparing Anthropic Claude LLMs

Claude 3.5 Sonnet (new) vs Claude 3.7 Sonnet - Detailed Performance & Feature Comparison

Other LLMs

SOTA LLM by Richard

https://huggingface.co/BasiliskLabs/Sovereign-0.1-72B

Transformer Models and Encoders

DistilBERT - can be fine-tuned to classify intent or to detect toxic user prompts

https://huggingface.co/distilbert/distilbert-base-uncased

- faster version of BERT

[1910.01108] DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter (arxiv.org)

LLMs and Security

Strengthening Your LLM Application Security: Prompt hacking and testing

HackAPrompt

https://github.com/Giskard-AI/giskard

Design Patterns against Prompt Injection

https://simonwillison.net/2025/Jun/13/prompt-injection-design-patterns/

Monitoring LLMs and Costs

Token counts - for cost estimates

https://huggingface.co/Xenova/claude-tokenizer

https://huggingface.co/spaces/Xenova/the-tokenizer-playground

https://docs.aws.amazon.com/bedrock/latest/userguide/model-customization-prepare.html
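Once token counts are known, the cost estimate is simple arithmetic. A minimal sketch follows. The prices in the example are placeholders, not real rates, so look up current pricing for the model in question. `rough_token_count` is a hypothetical fallback based on an assumed characters-per-token ratio; a real tokenizer such as the ones linked above gives the accurate count.

```python
def estimate_cost_usd(input_tokens, output_tokens, input_price_per_mtok, output_price_per_mtok):
    """Cost of one request, given token counts and prices per million tokens."""
    total = input_tokens * input_price_per_mtok + output_tokens * output_price_per_mtok
    return total / 1_000_000

def rough_token_count(text, chars_per_token=4.0):
    """Very rough fallback when no tokenizer is at hand; the ratio is an assumption."""
    return max(1, round(len(text) / chars_per_token))

# Placeholder prices in USD per million tokens (NOT real rates).
cost = estimate_cost_usd(
    input_tokens=2000,
    output_tokens=500,
    input_price_per_mtok=3.0,
    output_price_per_mtok=15.0,
)
print(f"estimated cost per request: ${cost:.4f}")
```

Output tokens are usually priced higher than input tokens, which is why the two counts are kept separate.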

Wednesday, March 20, 2024