Saturday, December 7, 2024

Exploring Multi-Agent Frameworks, Books and Papers

In the rapidly evolving field of artificial intelligence (AI), open-source libraries and frameworks have emerged as valuable resources for developers and researchers. This blog post aims to provide an overview of some notable open-source AI agent libraries, their features, and potential use cases.

Saturday, August 31, 2024

War-time and Spy Books

Exile's Return

cowley - killing thatcher (doug)

Seventeen Moments of Spring (soviet spy)

(done) Operation Mincemeat


Sunday, June 16, 2024

Food and Cooking - recipes

from Kitchen Confidential- Anthony Bourdain

Orwell's Down and Out in Paris and London. 

Nicolas Freeling's The Kitchen, 

David Blum's Flash in the Pan, 

the Batterberrys' fine account of American restaurant history, On the Town in New York, 

Joseph Mitchell's Up in the Old Hotel.

Read the old masters: Escoffier, Bocuse et al.

as well as the Young Turks: Keller, Marco Pierre White

———————

Jamie Oliver recipes

Caponata - aubergine stew

https://www.jamieoliver.com/recipes/vegetables-recipes/incredible-sicilian-aubergine-stew-caponata/



Sunday, April 28, 2024

Thursday, April 11, 2024

AI: algorithms


K-Means clustering [can be expensive]

- elbow method - calculates how many clusters to use, but can be expensive
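
A minimal sketch of the elbow method, in plain Python so it runs without extra libraries. It is an illustration under assumptions, not a reference implementation: the toy data, the helper names (`sq_dist`, `mean`, `farthest_point_init`, `kmeans_inertia`) and the deterministic farthest-point initialisation are all made up for this note (real libraries usually use random or k-means++ initialisation). The idea is to run k-means once per candidate k and look for the k where the within-cluster sum of squares stops dropping sharply.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(col) / len(cluster) for col in zip(*cluster))


def farthest_point_init(points, k):
    """Deterministic init (an assumption of this sketch): start from the first
    point, then repeatedly add the point farthest from the centres so far."""
    centres = [points[0]]
    while len(centres) < k:
        centres.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centres)))
    return centres


def kmeans_inertia(points, k, iters=20):
    """Run plain Lloyd's k-means; return the within-cluster sum of squares."""
    centres = farthest_point_init(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centres[i]))
            clusters[nearest].append(p)
        # Keep the old centre if a cluster ends up empty.
        centres = [mean(c) if c else centres[i] for i, c in enumerate(clusters)]
    return sum(min(sq_dist(p, c) for c in centres) for p in points)


# Toy data: three tight 3x3 grids of points around well-separated centres.
offsets = [(dx, dy) for dx in (-0.5, 0.0, 0.5) for dy in (-0.5, 0.0, 0.5)]
points = [(cx + dx, cy + dy)
          for cx, cy in [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
          for dx, dy in offsets]

# One full k-means run per candidate k: this loop is where the cost comes from.
inertias = {k: kmeans_inertia(points, k) for k in range(1, 7)}
for k, value in inertias.items():
    print(k, round(value, 2))
```

On this toy data the printed values fall steeply up to k = 3 and change very little after that, which is the "elbow". The expense noted above comes from the loop at the end: every candidate k needs its own complete k-means run over the data.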

alt: DBSCAN, which works out the number of clusters itself (a density-based, non-parametric clustering algorithm)

https://en.wikipedia.org/wiki/DBSCAN
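
A rough sketch of DBSCAN in plain Python, to show that the number of clusters falls out of the data rather than being chosen up front. The function names, the toy points and the parameter values (`eps=1.0`, `min_pts=4`) are assumptions for illustration only; the neighbour search here is brute force, so this is not how a tuned library version would do it.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if sq_dist(points[i], q) <= eps * eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, 2, ... for clusters, -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may be relabelled later as a border point
            continue
        cluster += 1  # points[i] is a core point, so it starts a new cluster
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but does not expand
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is also a core point: keep growing
    return labels


# Toy data: three tight 3x3 grids of points plus one far-away outlier.
offsets = [(dx, dy) for dx in (-0.5, 0.0, 0.5) for dy in (-0.5, 0.0, 0.5)]
points = [(cx + dx, cy + dy)
          for cx, cy in [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
          for dx, dy in offsets]
points.append((50.0, 50.0))

labels = dbscan(points, eps=1.0, min_pts=4)
n_clusters = len(set(labels) - {-1})
print(n_clusters, labels.count(-1))  # prints: 3 1
```

Nothing in the call says "three clusters": the count comes from how the dense regions connect, and the isolated point is reported as noise. The trade-off is that `eps` and `min_pts` now have to be chosen instead of k.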



Tuesday, April 9, 2024

books read by Jordan Peterson

the parasitic mind

return of the god hypothesis

the myth of mental illness

factfulness

east of eden

the madness of crowds

the rational optimist

an anthropologist on mars


Wednesday, March 20, 2024

AI: LLMs, platforms (hardware) and projects

(AI in Business) - Applied Artificial Intelligence

https://www.bookdepository.com/Applied-Artificial-Intelligence-Where-AI-Can-Be-Used-Business-Francesco-Corea/9783319772516?ref=grid-view&qid=1592465308969&sr=1-1 


Web element detection

OpenDILabCommunity/webpage_element_detection · Hugging Face


Layout parser - doc ingester

https://layout-parser.github.io/




Data generation - Langchain


DistilBERT: Faster version of BERT


DBRX - a 'mixture of experts' LLM

MoE in depth article 

Deploying and Fine-tuning Deepseek

Hermes-2 Mistral 7B - good for "Function Calling" (the LLM delegates actions, such as creating or modifying an app, to the client)


multimodal LLMs
HPT 1.5 Air: A New Open-Sourced 8B Multimodal LLM with Llama 3


Datasets - Roboflow (like HuggingFace for data)

AWS Inferentia2 (inf2 AWS instances - cheaper than g5)

Prompt hacking and testing

Token counts - for cost estimates

LLM powered autonomous agents

Building a NN from scratch

Data science and data trends (via ML) Python libraries

evaluating LLMs
prometheus-2

LLM GitHub accelerator projects 



DistilBERT - can be fine-tuned to classify intent or toxicity

Prompt engineering guides

Tiny Multi-modal LLM - regions + OCR

AWS Bedrock Tips
- batch inference

AWS Inferentia - speculative decoding

AWS Prompt Routing and prompt caching
- can auto pick an LLM?

SOTA LLM by Richard

Wednesday, March 6, 2024

AI Papers, Books and Datasets

Learning Transferable Visual Models From Natural Language Supervision (CLIP):
https://arxiv.org/abs/2103.00020

-

Datasets

from AI Engineering:

Resources for Publicly Available Datasets

Here are a few resources where you can look for publicly available datasets. While you should take advantage of available data, you should never fully trust it. Data needs to be thoroughly inspected and validated.

Always check a dataset's license before using it. Try your best to understand where the data comes from. Even if a dataset has a license that allows commercial use, it's possible that part of it comes from a source that doesn't.

  1. Hugging Face (https://oreil.ly/tlt5h) and Kaggle (https://oreil.ly/g8A4a) each host hundreds of thousands of datasets.
  2. Google has a wonderful and underrated Dataset Search (https://oreil.ly/TgOaR).
  3. Governments are often great providers of open data. Data.gov (https://data.gov) hosts hundreds of thousands of datasets, and data.gov.in (https://data.gov.in) hosts tens of thousands.
  4. University of Michigan's Institute for Social Research (https://oreil.ly/VhVzp) ICPSR has data from tens of thousands of social studies.
  5. UC Irvine's Machine Learning Repository (https://oreil.ly/jAR9e) and OpenML (https://oreil.ly/d-Yty) are two older dataset repositories, each hosting several thousand datasets.
  6. The Open Data Network (https://oreil.ly/_tW6P) lets you search among tens of thousands of datasets.
  7. Cloud service providers often host a small collection of open datasets; the most notable one is AWS's Open Data (https://oreil.ly/DZ5uV).
  8. ML frameworks often have small pre-built datasets that you can load while using the framework, such as TensorFlow datasets (https://oreil.ly/HMJX_).
  9. Some evaluation harness tools host evaluation benchmark datasets that are sufficiently large for PEFT finetuning. For example, EleutherAI's lm-evaluation-harness (https://github.com/EleutherAI/lm-evaluation-harness) hosts 400+ benchmark datasets, averaging 2,000+ examples per dataset.

  10. The Stanford Large Network Dataset Collection (hts://oreilye_B) is a great repository for graph datasets.


Free Datasets

UI Modelling - annotated UIs

Datasets - Roboflow (like HuggingFace for data)

Dataset - Anthropic use of AI by job category 

-

MoE (Mixture of Experts)


An Introduction to Vision-Language Modeling

Vision AI - VLMs and CNNs
- Vision Language Models
- Convolutional Neural Networks

Multi-modal LLMs

Movies: rated [for collaborative filtering]

The Society of Mind
Book by Marvin Minsky

AI Engineering [Book] [O'Reilly] [Zeki recommends]
Same author: Designing Machine Learning Systems (DMLS). DMLS focuses on building applications on top of traditional ML models, which involves more tabular data annotations, feature engineering, and model training.

Tuesday, February 13, 2024

gen-ai news and updates

 

LocalLLaMA - gen-ai on reddit

https://www.reddit.com/r/LocalLLaMA/


LLM Leaderboard - https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard


Superintelligence


the beginning of infinity

The Business Case for AI: A Leader's Guide to AI Strategies, Best Practices & Real-World Applications
Book by Kavita Ganesan
- but neglects challenge of culture change and ethics
- profit vs ethics
