K-Means clustering [can be expensive]
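- elbow method - estimates how many clusters to use, but can be expensive (k-means has to be refit for every candidate k)
alt: DBSCAN, which does not need the number of clusters up front and infers it from point density (density-based, non-parametric clustering algorithm): https://en.wikipedia.org/wiki/DBSCAN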
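Rough sketch of both ideas in plain Python, to make the cost difference concrete. Everything here is an assumption for illustration: the toy points, the helper names (`sq_dist`, `mean`, `kmeans_inertia`, `dbscan`) and the `eps` / `min_pts` values. It is a minimal version, not a tuned implementation; in practice a library such as scikit-learn would likely be used instead.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean(group):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(group) for c in zip(*group))

def kmeans_inertia(points, k, iters=20, seed=0):
    """Run plain Lloyd's k-means; return the within-cluster sum of squares."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        # Assign each point to its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            groups[j].append(p)
        # Move each center to its group's mean (keep it if the group is empty).
        centers = [mean(g) if g else centers[i] for i, g in enumerate(groups)]
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # Brute-force neighborhood query; includes the point itself.
        return [j for j, q in enumerate(points)
                if sq_dist(points[i], q) <= eps * eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1  # noise for now; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:  # j is a core point, keep expanding
                queue.extend(j_nbrs)
    return labels

# Toy data (assumed): three 3x3 grids far apart, plus one isolated outlier.
points = [(cx + 0.5 * i, cy + 0.5 * j)
          for cx, cy in [(0, 0), (10, 0), (0, 10)]
          for i in range(3) for j in range(3)]
points.append((5.0, 5.0))

# Elbow method: one full k-means fit per candidate k, then look for the bend.
inertias = {k: kmeans_inertia(points, k) for k in range(1, 6)}
for k, value in inertias.items():
    print(k, round(value, 2))

# DBSCAN: a single pass; the number of clusters falls out of the density.
labels = dbscan(points, eps=0.8, min_pts=3)
print(len(set(labels) - {-1}), "clusters,", labels.count(-1), "noise point(s)")
```

The elbow loop refits k-means once per candidate k, which is where the expense comes from. DBSCAN trades that for having to pick `eps` and `min_pts`, and the brute-force neighbor search above is itself quadratic in the number of points.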
the parasitic mind
return of the god hypothesis
the myth of mental illness
factfulness
east of eden
the madness of crowds
the rational optimist
an anthropologist on mars
Resources for Publicly Available Datasets
Here are a few resources where you can look for publicly available datasets. While you should take advantage of available data, you should never fully trust it. Data needs to be thoroughly inspected and validated.
Always check a dataset's license before using it. Try your best to understand where the data comes from. Even if a dataset has a license that allows commercial use, it's possible that part of it comes from a source that doesn't:
10. The Stanford Large Network Dataset Collection (hts://oreilye_B) is a great
repository for graph datasets.
LocalLLaMA - gen-ai on reddit
https://www.reddit.com/r/LocalLLaMA/
LLM Leaderboard - https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
Superintelligence
the beginning of infinity
The Decline and Fall of the Roman Empire
Black Lamb and Grey Falcon
The Lessons of History