Running small (8B-16B parameter) large language models (LLMs) on your local computer is possible with Ollama. Just install the software, which is available for GNU/Linux, macOS and Windows, and download the appropriate model(s) from the repository (see the Note on memory size recommendations in the Model library section of the Ollama developers' GitHub page). Ollama can run on your CPU and system memory (RAM) instead of a GPU and VRAM. I have been able to load an 8 billion (8B) parameter LLM (Dolphin-Llama3) with Ollama on a 2016 Dell Latitude E5470 laptop (Intel Core i5, 8GB RAM, 256GB SSD). It does take significantly longer for the LLM to generate a reply compared to my desktop's beefier Ryzen CPU, but it does spit out an answer eventually.
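Once a model is downloaded, you can also talk to it from a script instead of the terminal. Here is a minimal sketch, assuming Ollama is running on the same machine and listening on its default local address (http://localhost:11434), and that a model tagged "dolphin-llama3" has already been pulled. The helper names build_request and ask are made up for illustration, and the long timeout is just a guess to allow for slow CPU-only replies.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for Ollama's generate endpoint."""
    payload = {
        "model": model,    # a model tag you have already downloaded
        "prompt": prompt,
        "stream": False,   # ask for one complete JSON reply instead of chunks
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model: str, prompt: str, timeout: float = 600.0) -> str:
    """Send the prompt to the local Ollama server and return the reply text."""
    request = build_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        body = json.loads(reply.read().decode("utf-8"))
    return body["response"]
```

With the server running, something like print(ask("dolphin-llama3", "Please explain water electrolysis.")) should print the model's answer. Nothing here leaves your machine, since the request only goes to localhost.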
The advantage of running the LLM locally is that you don't have to be concerned about your data getting mined. Also, in a SHTF scenario you can use the pre-downloaded LLMs as an offline search engine. Ollama only needs the internet to download models; once you are done downloading them you can physically disconnect the computer from the internet and Ollama will load the models you have without having to "phone home." The disadvantage of running your models locally is that the models you can use will be limited by your hardware (if I'm understanding the tech correctly). For example, the laptop I mentioned above can only run models up to about 7B-8B parameters because of the 8GB RAM limitation (and the OS sucking up a few gigs of RAM). If I wanted to load a "smarter" model (the more parameters, the "smarter" the model) I would need to increase the memory of my system. So if you can afford a GPU setup with 640GB of VRAM then hurray, you can run the more accurate ("smarter") 405B parameter Llama 3.1 LLM (at FP8 precision). The rest of us mere mortals will have to settle for something like the 8B Llama 3.1 LLM 😜
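The memory limit comes down to simple arithmetic: the model weights take roughly (number of parameters) x (bytes per parameter), plus some working memory on top. Below is a back-of-the-envelope sketch of that estimate. The bytes-per-parameter figures are approximations (4-bit quantized files are usually a bit larger than 0.5 bytes per parameter), and the 20% overhead and 3GB reserved for the OS are my own assumptions for illustration, not numbers from Ollama.

```python
# Approximate storage cost per parameter at common precisions.
BYTES_PER_PARAM = {
    "fp16": 2.0,  # 16-bit floats
    "fp8": 1.0,   # 8-bit floats
    "q4": 0.5,    # roughly 4-bit quantization (approximation)
}


def weight_memory_gb(params_billions: float, precision: str) -> float:
    """Memory for the weights alone, in GB (1 GB = 10^9 bytes)."""
    # billions of parameters * bytes per parameter = billions of bytes = GB
    return params_billions * BYTES_PER_PARAM[precision]


def fits(params_billions: float, precision: str, ram_gb: float,
         reserved_gb: float = 3.0, overhead: float = 1.2) -> bool:
    """Rough check: do weights plus overhead fit in the RAM left after the OS?

    reserved_gb and overhead are assumed values, not measurements.
    """
    needed = weight_memory_gb(params_billions, precision) * overhead
    return needed <= ram_gb - reserved_gb


for size in (7, 8, 12):
    verdict = "fits" if fits(size, "q4", ram_gb=8) else "does not fit"
    print(f"{size}B at q4 on an 8GB machine: {verdict}")

print(f"405B at fp8 needs about {weight_memory_gb(405, 'fp8'):.0f} GB for weights alone")
```

Under these assumptions a 7B or 8B model at 4-bit squeezes into an 8GB laptop while a 12B model does not, and the 405B model needs around 405GB just for its weights at FP8, which is why the recommended VRAM figure is so much higher once working memory is added.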
I've been probing various models with questions regarding subject matter I'm familiar with (from science/engineering to Bible scriptures) in order to get a sense of how accurate the models are. For the most part they have provided accurate information; however, sometimes the answers have not been so accurate. For instance, I think it was the Llama2:7B model that told me hydrogen collects near the anode (positive electrode) during water electrolysis, but according to technical sources and my own experimentation hydrogen collects near the cathode (negative electrode). The Dolphin-Mistral:7B (uncensored) model, on the other hand, answered correctly. As a side note, uncensored models are more likely to comply with whatever you ask of them.
I also asked the Dolphin-Llama3:8B, Dolphin-Mistral:7B, and Mistral-NeMo:12B models to quote me a verse from the book of Revelation (chapter 2, verse 1) from the King James Version (public domain) of the Bible. My input was: "Please quote from the King James Bible the book of Revelation chapter 2 verse 1." Their responses can be seen below (I use please and thank you with the models so that I don't pick up the bad habit of being rude to people 😁).
Dolphin-Mistral:7B response: "Unto the angel of the church in Ephesus write; These things saith he that holdeth the seven stars in his right hand, who walketh in the midst of the seven golden candlesticks;" (Revelation 2:1, King James Bible)
Dolphin-Llama3:8B response: "Unto the angel of the church in Ephesus write; These things saith the Holy One, who is and who was and who is to come: I know thy works, and that thou art faithful unto death, and hast not fainted." - Revelation 2:1 (King James Bible)
Mistral-NeMo:12B (asked the 1st time): Revelation 2:1 (KJV): "And he showed me a vision of a great throne at heaven's door, and Someone sitting on it."
Mistral-NeMo:12B (asked a 2nd time): Revelation 2:1 (KJV): "Unto the angel of the church of Ephesus write; These things saith he that holdeth the seven stars in his right hand, who walketh in the midst of the seven golden candlesticks;"
The Dolphin-Mistral:7B model answered correctly on the first try, but the Dolphin-Llama3 model failed on multiple consecutive tries (it would merge various verses together or give a totally different verse). The Mistral-NeMo:12B model failed the first attempt, but guessed correctly on the second attempt. Suffice it to say, take what these LLMs "say" with a grain of salt.
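If you want to repeat this kind of spot check without eyeballing every reply, you can score each response against a text you already trust. The sketch below is one way to do it, not a rigorous benchmark: it compares word sequences with Python's standard difflib module, using the verse as given in the second Mistral-NeMo response above as the reference. The helper names (words, similarity) are made up for illustration.

```python
import re
from difflib import SequenceMatcher

# Reference text: the verse as quoted in the second Mistral-NeMo response.
REFERENCE = (
    "Unto the angel of the church of Ephesus write; These things saith he "
    "that holdeth the seven stars in his right hand, who walketh in the "
    "midst of the seven golden candlesticks;"
)


def words(text: str) -> list[str]:
    """Lowercase the text and keep only the words, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def similarity(response: str, reference: str = REFERENCE) -> float:
    """Word-level similarity from 0.0 (nothing shared) to 1.0 (identical)."""
    return SequenceMatcher(None, words(reference), words(response)).ratio()


responses = {
    "Dolphin-Mistral:7B": (
        "Unto the angel of the church in Ephesus write; These things saith "
        "he that holdeth the seven stars in his right hand, who walketh in "
        "the midst of the seven golden candlesticks;"
    ),
    "Mistral-NeMo:12B (1st try)": (
        "And he showed me a vision of a great throne at heaven's door, "
        "and Someone sitting on it."
    ),
}

for model, reply in responses.items():
    print(f"{model}: {similarity(reply):.2f}")
```

A score near 1.0 means the reply is close to word-for-word, while a low score flags a reply worth reading carefully. Where to draw the line between "close enough" and "wrong" is a judgment call you would have to make yourself.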
I have found the LLMs useful in helping me recall technical terms/verbiage of subjects I am familiar with (while offline, which is awesome). They have also been useful in providing examples of algorithms in various programming languages. The information provided by the LLMs has helped me narrow down my search from volumes of technical information to a few pages. A great time saver, to say the least. Once properly implemented with text-to-speech and speech-to-text, imagine how it could help someone who is visually impaired do research and development.
Thus far (if used responsibly) I see this LLM "AI" technology being very useful.
Where you can search/browse Ollama-compatible LLMs
Llama 3.1 Info (memory requirements for this particular model can be found here)
What are the differences/similarities between Llama 7B, 13B and 70B?