NVIDIA Introduces RankRAG: A Novel RAG Framework that Instruction-Tunes a Single LLM for the Dual Purposes of Top-k Context Ranking and Answer Generation in RAG https://lnkd.in/dWXG3EfX
Rubem Didini’s Post
More Relevant Posts
-
One more open-source LLM: DBRX - a new state-of-the-art open LLM! 🌟 https://lnkd.in/gVNuCiJi Trained on 3,072 NVIDIA H100s for 90 days - demand for H100s keeps going up!!! Total parameters: 132B. Active parameters: 36B. 💡 Despite having 132 billion total parameters, DBRX uses a Mixture-of-Experts (MoE) architecture to use compute efficiently: only 4 of its 16 experts are active per token at inference, so just 36 billion parameters are active! 🤯 #DBRX
Introducing DBRX: A New State-of-the-Art Open LLM | Databricks
databricks.com
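To make the MoE arithmetic above concrete, here is a toy sketch of top-k expert routing. The 16-experts/4-active configuration matches DBRX's published design; the router scores and the routing function itself are invented for illustration, not taken from the DBRX implementation.

```python
# Toy sketch of Mixture-of-Experts (MoE) top-k routing.
# Only the experts selected by the router run for a given token,
# which is why "active" parameters are far fewer than total parameters.

def top_k_experts(router_scores, k=4):
    """Return the indices of the k highest-scoring experts."""
    ranked = sorted(range(len(router_scores)),
                    key=lambda i: router_scores[i], reverse=True)
    return sorted(ranked[:k])

# DBRX-style configuration: 16 experts, 4 active per token.
NUM_EXPERTS, ACTIVE = 16, 4

# Made-up router scores for one token.
scores = [0.1, 0.9, 0.3, 0.8, 0.2, 0.7, 0.05, 0.6,
          0.15, 0.4, 0.35, 0.25, 0.55, 0.45, 0.12, 0.05]
chosen = top_k_experts(scores, k=ACTIVE)
print("experts used for this token:", chosen)

# If expert weights dominate the parameter count, roughly
# ACTIVE / NUM_EXPERTS of the expert parameters run per token.
print("active fraction of expert params:", ACTIVE / NUM_EXPERTS)  # 0.25
```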
-
How I fine-tuned an LLM to write like me!

The latest trend in AI is Large Language Models (LLMs) and their ability to generate text, images and even code. I wanted to try one out and see how it would perform. I started with the excellent open-source Hugging Face Transformers library, which lets you browse, download and use pre-trained models. I downloaded one of the most popular models, LLaMA, and used it to generate some text. It was pretty good, but not great: it lacked the nuance and personality I try to bring to my posts. So I decided to fine-tune it!

Fine-tuning an LLM is a simple process. You need a pre-trained model and some Python code. Here's the code I used: https://lnkd.in/dJ5YHSpK It's still not as good as my writing, but it's pretty close! I'm sure that with more fine-tuning it will get even better. What do you think? Have you tried anything similar?

--------------------------------------------------
Talk to me about AI, NLP, Machine Learning and Data Science! 👨💻 I'm Mikkel Jensen, a Data Scientist and AI consultant. #AI #NLP #MachineLearning #DataScience
--------------------------------------------------

This is part two in a series showcasing anecdotal evidence for using an LLM's context window versus fine-tuning it. Link to the first post in the comments 👇

The text above was created with a fine-tuned version of Llama-2, tuned on my old LinkedIn posts. I changed 2-3 minor things from the prompt output. Find the original output in the code!

Things I notice from fine-tuning:
- Hallucination seems to be a big problem
- The output resembles my style of writing much better than when using the context window

Next up: fine-tuning with more prompt engineering. Feel free to steal the code!
GitHub - mikkel92/LLMs-for-LinkedIn
github.com
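The "context window" alternative this series compares against amounts to few-shot prompt construction: paste past posts into the prompt instead of training on them. A minimal sketch (the example posts, template wording, and function name are all invented for illustration):

```python
# Sketch of the context-window approach: include a few of your own
# posts in the prompt so the model imitates your style in-context,
# with no training at all.

def build_fewshot_prompt(example_posts, topic):
    """Assemble a few-shot prompt from past posts plus a new topic."""
    parts = ["Here are examples of my LinkedIn posts:"]
    for i, post in enumerate(example_posts, 1):
        parts.append(f"Example {i}:\n{post}")
    parts.append(f"Write a new post in the same style about: {topic}")
    return "\n\n".join(parts)

examples = [
    "Shipped a new ML pipeline today. Lesson learned: start simple!",
    "Debugging data drift is 80% of the job. The other 20% is meetings.",
]
prompt = build_fewshot_prompt(examples, "fine-tuning LLMs on a budget")
print(prompt)
```

The trade-off the series explores: this costs context-window tokens on every call, while fine-tuning pays a one-time training cost instead.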
-
Want to learn how to fine-tune a Large Language Model (LLM) on consumer-grade hardware (e.g. a free-tier Google Colab instance)? Check out this step-by-step tutorial in a recent blog post from PyTorch: https://lnkd.in/e5jgvein Original post: https://lnkd.in/e9-DTvC4
Finetune LLMs on your own consumer hardware using tools from PyTorch and Hugging Face ecosystem
pytorch.org
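Tutorials like this typically rely on parameter-efficient methods such as LoRA, which train a small low-rank update instead of the full weight matrix. A framework-free sketch of the core idea (tiny dimensions and made-up numbers, not the tutorial's actual code):

```python
# LoRA-style low-rank update: instead of training all of
# W (d_out x d_in), train small factors B (d_out x r) and
# A (r x d_in), and use W' = W + (alpha / r) * (B @ A).

def matmul(X, Y):
    """Plain-Python matrix multiply."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def lora_update(W, A, B, alpha, r):
    """Apply the scaled low-rank update to the frozen base weights W."""
    BA = matmul(B, A)
    s = alpha / r
    return [[W[i][j] + s * BA[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

# Tiny example: 4x4 base matrix, rank-1 update (r=1, alpha=2).
W = [[1.0] * 4 for _ in range(4)]
B = [[1.0], [0.0], [0.0], [0.0]]   # d_out x r
A = [[0.5, 0.0, 0.0, 0.0]]         # r x d_in
W_new = lora_update(W, A, B, alpha=2, r=1)
print(W_new[0][0])  # 1.0 + 2.0 * (1.0 * 0.5) = 2.0
print(W_new[1][0])  # rows the update doesn't touch stay at 1.0

# Trainable params here: 4*1 + 1*4 = 8 vs 16 for full W. At LLM
# scale that gap is what makes single-GPU fine-tuning feasible.
```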
-
Got GPU? https://lnkd.in/gRapK2AR Learn how to put your GPU to work at #THETACON. Visit Thetatoken.org or Thetacon.org to learn more.
Theta EdgeCloud: Ushering in a new era of AI Computing.
medium.com
-
If you'd like to accelerate your model's inference performance, follow Het Trivedi's solid guide to deploying LLMs into production using NVIDIA's open-source TensorRT-LLM framework.
Deploying LLMs Into Production Using TensorRT LLM
towardsdatascience.com
-
Merge Large Language Models with mergekit. Model merging is a technique that combines two or more LLMs into a single model. It's a relatively new, experimental method for creating new models cheaply (no GPU required). Model merging works surprisingly well and has produced many state-of-the-art models on the Open LLM Leaderboard. Blog post on Hugging Face: https://lnkd.in/eRgEJvmq
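The simplest merge strategy mergekit supports is a linear (weighted-average) combination of matching parameters. A toy sketch over plain dicts of scalars (names and values invented; real merges operate on full tensors) shows the mechanics:

```python
# Toy sketch of a linear model merge: weighted average of matching
# parameters from two checkpoints with identical architectures.

def linear_merge(state_a, state_b, weight_a=0.5):
    """Weighted average of two parameter dicts with identical keys."""
    assert state_a.keys() == state_b.keys(), "architectures must match"
    wb = 1.0 - weight_a
    return {k: weight_a * state_a[k] + wb * state_b[k] for k in state_a}

# Stand-ins for two fine-tuned checkpoints of the same base model.
model_a = {"layer0.weight": 0.2, "layer0.bias": 1.0}
model_b = {"layer0.weight": 0.6, "layer0.bias": 3.0}

merged = linear_merge(model_a, model_b, weight_a=0.5)
print(merged)
```

Because no forward or backward passes are involved, this really does run without a GPU, which is why merging is such a cheap way to produce new models.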
-
A simple explanation of RAG, from the NVIDIA blog…
What Is Retrieval-Augmented Generation aka RAG?
blogs.nvidia.com
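The RAG loop the NVIDIA post describes — retrieve relevant passages, then condition generation on them — can be sketched with a toy word-overlap retriever and a prompt builder. The corpus, scoring function, and prompt template are all illustrative; a real system would use embeddings and call an actual LLM.

```python
# Minimal retrieval-augmented generation (RAG) sketch:
# 1) score documents against the query, 2) keep the top-k,
# 3) prepend them as context to the prompt sent to the LLM.

def score(query, doc):
    """Toy relevance score: count of shared lowercase words."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query, docs, k=2):
    """Return the k documents with the highest overlap score."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query, docs, k=2):
    context = "\n".join(retrieve(query, docs, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

corpus = [
    "RAG augments an LLM with retrieved documents.",
    "GPUs accelerate matrix multiplication.",
    "Retrieved context reduces hallucination in RAG systems.",
]
prompt = build_prompt("How does RAG use retrieved context?", corpus, k=2)
print(prompt)
```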
-
If, like me, you are trying to fit LLM fine-tuning experiments onto a single consumer GPU, have a look at a short and simple article I wrote on this topic. #LLM #finetuning #GPU #GPUexperiments #LLMoptimization #singleGPU #machinelearning #deeplearning #neuralnetworks #techarticle #GPUperformance
vRAM Requirements for LLM Fine-Tuning
medium.com
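As a back-of-the-envelope version of the kind of estimate such an article makes (a common rule of thumb, not figures taken from the linked post): weights, gradients, and Adam optimizer states each scale with parameter count, so full fine-tuning needs several times the vRAM of inference.

```python
# Rule-of-thumb vRAM estimate for LLMs (activations excluded; those
# add more, depending on batch size and sequence length).
# Common rough accounting for full fine-tuning with Adam:
#   fp16 weights (2 B/param) + fp16 grads (2 B/param)
#   + fp32 optimizer states: master weights, momentum, variance (12 B/param)

GB = 1024 ** 3

def vram_gb(n_params, bytes_per_param):
    return n_params * bytes_per_param / GB

n = 7e9  # a 7B-parameter model

inference_fp16 = vram_gb(n, 2)            # weights only
train_full_adam = vram_gb(n, 2 + 2 + 12)  # weights + grads + Adam states

print(f"fp16 inference : {inference_fp16:.1f} GB")   # ~13.0 GB
print(f"full fine-tune : {train_full_adam:.1f} GB")  # ~104.3 GB
```

The gap between those two numbers is why single-GPU fine-tuning leans on tricks like quantization, LoRA, and gradient checkpointing.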
-
In a comprehensive technical guide, Chaim Rand zooms in on model-training approaches and discusses "one of the more advanced optimization techniques — one that sets apart the true rock stars from the simple amateurs — creating a custom PyTorch operator in C++ and CUDA."
Accelerating AI/ML Model Training with Custom Operators
towardsdatascience.com
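The guide's subject is C++/CUDA PyTorch operators; as a framework-free illustration of the contract any custom op must satisfy — a forward pass plus a hand-derived backward that agrees with numerical gradients — here is a pure-Python fused multiply-add op (the op and its check are invented for illustration, not code from the guide):

```python
# A custom operator boils down to a forward function plus a
# hand-derived backward; frameworks like PyTorch then plug both
# into autograd. Here: fused multiply-add y = a * x + b.

def fma_forward(a, x, b):
    return a * x + b

def fma_backward(a, x, b, grad_out):
    """Analytic gradients of y = a*x + b w.r.t. (a, x, b)."""
    return grad_out * x, grad_out * a, grad_out

def numeric_grad(f, args, i, eps=1e-6):
    """Central finite difference w.r.t. the i-th argument."""
    hi = list(args); hi[i] += eps
    lo = list(args); lo[i] -= eps
    return (f(*hi) - f(*lo)) / (2 * eps)

# Sanity-check the analytic backward against finite differences --
# the same idea torch.autograd.gradcheck automates for real ops.
a, x, b = 2.0, 3.0, 1.0
analytic = fma_backward(a, x, b, grad_out=1.0)
numeric = tuple(numeric_grad(fma_forward, (a, x, b), i) for i in range(3))
print("analytic:", analytic)  # (3.0, 2.0, 1.0)
print("numeric :", numeric)
assert all(abs(g1 - g2) < 1e-4 for g1, g2 in zip(analytic, numeric))
```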