NVIDIA Introduces RankRAG: A Novel RAG Framework that Instruction-Tunes a Single LLM for the Dual Purposes of Top-k Context Ranking and Answer Generation in RAG https://lnkd.in/dWXG3EfX
Rubem Didini’s Post
More Relevant Posts
-
One more open-source LLM: DBRX - a new state-of-the-art open LLM! 🌟 https://lnkd.in/gVNuCiJi Trained on 3,072 NVIDIA H100s for 90 days - demand for H100s keeps going up!!! Total parameters: 132B. Active parameters: 36B. 💡 Despite having 132 billion total parameters, DBRX uses a Mixture-of-Experts (MoE) architecture to use compute efficiently: only 4 of its 16 experts are active per token at inference, so just 36 billion parameters are active! 🤯 #DBRX
Introducing DBRX: A New State-of-the-Art Open LLM | Databricks
databricks.com
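To make the MoE arithmetic above concrete, here is a toy sketch of top-k expert routing. The 16-experts/4-active configuration matches DBRX's published design; the router scores and the routing function itself are invented for illustration, not taken from the DBRX implementation.

```python
# Toy sketch of Mixture-of-Experts (MoE) top-k routing.
# Only the experts selected by the router run for a given token,
# which is why "active" parameters are far fewer than total parameters.

def top_k_experts(router_scores, k=4):
    """Return the indices of the k highest-scoring experts."""
    ranked = sorted(range(len(router_scores)),
                    key=lambda i: router_scores[i], reverse=True)
    return sorted(ranked[:k])

# DBRX-style configuration: 16 experts, 4 active per token.
NUM_EXPERTS, ACTIVE = 16, 4

# Made-up router scores for one token.
scores = [0.1, 0.9, 0.3, 0.8, 0.2, 0.7, 0.05, 0.6,
          0.15, 0.4, 0.35, 0.25, 0.55, 0.45, 0.12, 0.05]
chosen = top_k_experts(scores, k=ACTIVE)
print("experts used for this token:", chosen)

# If expert weights dominate the parameter count, roughly
# ACTIVE / NUM_EXPERTS of the expert parameters run per token.
print("active fraction of expert params:", ACTIVE / NUM_EXPERTS)  # 0.25
```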
-
How I fine-tuned an LLM to write like me!

The latest trend in AI is Large Language Models (LLMs) and their ability to generate text, images and even code. I wanted to try one out and see how it would perform. I started with the excellent open-source Hugging Face Transformers library, which lets you browse, download and use pre-trained models. I downloaded one of the most popular models, LLaMA, and used it to generate some text. It was pretty good, but not great: it lacked the nuance and personality I try to bring to my posts. So I decided to fine-tune it!

Fine-tuning an LLM is a simple process. You need a pre-trained model and some Python code. Here's the code I used: https://lnkd.in/dJ5YHSpK It's still not as good as my writing, but it's pretty close! I'm sure that with more fine-tuning it will get even better. What do you think? Have you tried anything similar?

--------------------------------------------------
Talk to me about AI, NLP, Machine Learning and Data Science! 👨💻 I'm Mikkel Jensen, a Data Scientist and AI consultant. #AI #NLP #MachineLearning #DataScience
--------------------------------------------------

This is part two in a series showcasing anecdotal evidence for using an LLM's context window versus fine-tuning it. Link to the first post in the comments 👇

The text above was created with a fine-tuned version of Llama-2, tuned on my old LinkedIn posts. I changed 2-3 minor things from the prompt output. Find the original output in the code!

Things I notice from fine-tuning:
- Hallucination seems to be a big problem
- The output resembles my style of writing much better than when using the context window

Next up: fine-tuning with more prompt engineering. Feel free to steal the code!
GitHub - mikkel92/LLMs-for-LinkedIn
github.com
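The "context window" alternative this series compares against amounts to few-shot prompt construction: paste past posts into the prompt instead of training on them. A minimal sketch (the example posts, template wording, and function name are all invented for illustration):

```python
# Sketch of the context-window approach: include a few of your own
# posts in the prompt so the model imitates your style in-context,
# with no training at all.

def build_fewshot_prompt(example_posts, topic):
    """Assemble a few-shot prompt from past posts plus a new topic."""
    parts = ["Here are examples of my LinkedIn posts:"]
    for i, post in enumerate(example_posts, 1):
        parts.append(f"Example {i}:\n{post}")
    parts.append(f"Write a new post in the same style about: {topic}")
    return "\n\n".join(parts)

examples = [
    "Shipped a new ML pipeline today. Lesson learned: start simple!",
    "Debugging data drift is 80% of the job. The other 20% is meetings.",
]
prompt = build_fewshot_prompt(examples, "fine-tuning LLMs on a budget")
print(prompt)
```

The trade-off the series explores: this costs context-window tokens on every call, while fine-tuning pays a one-time training cost instead.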
-
Want to learn how to fine-tune a Large Language Model (LLM) on consumer-grade hardware (e.g. a free-tier Google Colab instance)? Check out this step-by-step tutorial in a recent blog post from PyTorch: https://lnkd.in/e5jgvein Original post: https://lnkd.in/e9-DTvC4
Finetune LLMs on your own consumer hardware using tools from PyTorch and Hugging Face ecosystem
pytorch.org
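Tutorials like this typically rely on parameter-efficient methods such as LoRA, which train a small low-rank update instead of the full weight matrix. A framework-free sketch of the core idea (tiny dimensions and made-up numbers, not the tutorial's actual code):

```python
# LoRA-style low-rank update: instead of training all of
# W (d_out x d_in), train small factors B (d_out x r) and
# A (r x d_in), and use W' = W + (alpha / r) * (B @ A).

def matmul(X, Y):
    """Plain-Python matrix multiply."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def lora_update(W, A, B, alpha, r):
    """Apply the scaled low-rank update to the frozen base weights W."""
    BA = matmul(B, A)
    s = alpha / r
    return [[W[i][j] + s * BA[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

# Tiny example: 4x4 base matrix, rank-1 update (r=1, alpha=2).
W = [[1.0] * 4 for _ in range(4)]
B = [[1.0], [0.0], [0.0], [0.0]]   # d_out x r
A = [[0.5, 0.0, 0.0, 0.0]]         # r x d_in
W_new = lora_update(W, A, B, alpha=2, r=1)
print(W_new[0][0])  # 1.0 + 2.0 * (1.0 * 0.5) = 2.0
print(W_new[1][0])  # rows the update doesn't touch stay at 1.0

# Trainable params here: 4*1 + 1*4 = 8 vs 16 for full W. At LLM
# scale that gap is what makes single-GPU fine-tuning feasible.
```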
-
Got GPU? https://lnkd.in/gRapK2AR Learn how to put your GPU to work at #THETACON. Visit Thetatoken.org or Thetacon.org to learn more.
Theta EdgeCloud: Ushering in a new era of AI Computing.
medium.com
-
If you'd like to accelerate your model's inference performance, follow Het Trivedi's solid guide to deploying LLMs into production using NVIDIA's open-source TensorRT-LLM framework.
Deploying LLMs Into Production Using TensorRT LLM
towardsdatascience.com
-
Merge Large Language Models with mergekit. Model merging is a technique that combines two or more LLMs into a single model. It's a relatively new, experimental method for creating new models cheaply (no GPU required). Model merging works surprisingly well and has produced many state-of-the-art models on the Open LLM Leaderboard. Blog post on Hugging Face: https://lnkd.in/eRgEJvmq
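The simplest merge strategy mergekit supports is a linear (weighted-average) combination of matching parameters. A toy sketch over plain dicts of scalars (names and values invented; real merges operate on full tensors) shows the mechanics:

```python
# Toy sketch of a linear model merge: weighted average of matching
# parameters from two checkpoints with identical architectures.

def linear_merge(state_a, state_b, weight_a=0.5):
    """Weighted average of two parameter dicts with identical keys."""
    assert state_a.keys() == state_b.keys(), "architectures must match"
    wb = 1.0 - weight_a
    return {k: weight_a * state_a[k] + wb * state_b[k] for k in state_a}

# Stand-ins for two fine-tuned checkpoints of the same base model.
model_a = {"layer0.weight": 0.2, "layer0.bias": 1.0}
model_b = {"layer0.weight": 0.6, "layer0.bias": 3.0}

merged = linear_merge(model_a, model_b, weight_a=0.5)
print(merged)
```

Because no forward or backward passes are involved, this really does run without a GPU, which is why merging is such a cheap way to produce new models.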
-
A simple explanation of RAG, from the NVIDIA blog…
What Is Retrieval-Augmented Generation aka RAG?
blogs.nvidia.com
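The RAG loop the NVIDIA post describes — retrieve relevant passages, then condition generation on them — can be sketched with a toy word-overlap retriever and a prompt builder. The corpus, scoring function, and prompt template are all illustrative; a real system would use embeddings and call an actual LLM.

```python
# Minimal retrieval-augmented generation (RAG) sketch:
# 1) score documents against the query, 2) keep the top-k,
# 3) prepend them as context to the prompt sent to the LLM.

def score(query, doc):
    """Toy relevance score: count of shared lowercase words."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query, docs, k=2):
    """Return the k documents with the highest overlap score."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query, docs, k=2):
    context = "\n".join(retrieve(query, docs, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

corpus = [
    "RAG augments an LLM with retrieved documents.",
    "GPUs accelerate matrix multiplication.",
    "Retrieved context reduces hallucination in RAG systems.",
]
prompt = build_prompt("How does RAG use retrieved context?", corpus, k=2)
print(prompt)
```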
-
If, like me, you are trying to fit LLM fine-tuning experiments onto a single consumer GPU, have a look at a short and simple article I wrote on this topic. #LLM #finetuning #GPU #GPUexperiments #LLMoptimization #singleGPU #machinelearning #deeplearning #neuralnetworks #techarticle #GPUperformance
vRAM Requirements for LLM Fine-Tuning
medium.com
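As a back-of-the-envelope version of the kind of estimate such an article makes (a common rule of thumb, not figures taken from the linked post): weights, gradients, and Adam optimizer states each scale with parameter count, so full fine-tuning needs several times the vRAM of inference.

```python
# Rule-of-thumb vRAM estimate for LLMs (activations excluded; those
# add more, depending on batch size and sequence length).
# Common rough accounting for full fine-tuning with Adam:
#   fp16 weights (2 B/param) + fp16 grads (2 B/param)
#   + fp32 optimizer states: master weights, momentum, variance (12 B/param)

GB = 1024 ** 3

def vram_gb(n_params, bytes_per_param):
    return n_params * bytes_per_param / GB

n = 7e9  # a 7B-parameter model

inference_fp16 = vram_gb(n, 2)            # weights only
train_full_adam = vram_gb(n, 2 + 2 + 12)  # weights + grads + Adam states

print(f"fp16 inference : {inference_fp16:.1f} GB")   # ~13.0 GB
print(f"full fine-tune : {train_full_adam:.1f} GB")  # ~104.3 GB
```

The gap between those two numbers is why single-GPU fine-tuning leans on tricks like quantization, LoRA, and gradient checkpointing.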
-
In a comprehensive technical guide, Chaim Rand zooms in on model-training approaches and discusses "one of the more advanced optimization techniques — one that sets apart the true rock stars from the simple amateurs — creating a custom PyTorch operator in C++ and CUDA."
Accelerating AI/ML Model Training with Custom Operators
towardsdatascience.com
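The guide's subject is C++/CUDA PyTorch operators; as a framework-free illustration of the contract any custom op must satisfy — a forward pass plus a hand-derived backward that agrees with numerical gradients — here is a pure-Python fused multiply-add op (the op and its check are invented for illustration, not code from the guide):

```python
# A custom operator boils down to a forward function plus a
# hand-derived backward; frameworks like PyTorch then plug both
# into autograd. Here: fused multiply-add y = a * x + b.

def fma_forward(a, x, b):
    return a * x + b

def fma_backward(a, x, b, grad_out):
    """Analytic gradients of y = a*x + b w.r.t. (a, x, b)."""
    return grad_out * x, grad_out * a, grad_out

def numeric_grad(f, args, i, eps=1e-6):
    """Central finite difference w.r.t. the i-th argument."""
    hi = list(args); hi[i] += eps
    lo = list(args); lo[i] -= eps
    return (f(*hi) - f(*lo)) / (2 * eps)

# Sanity-check the analytic backward against finite differences --
# the same idea torch.autograd.gradcheck automates for real ops.
a, x, b = 2.0, 3.0, 1.0
analytic = fma_backward(a, x, b, grad_out=1.0)
numeric = tuple(numeric_grad(fma_forward, (a, x, b), i) for i in range(3))
print("analytic:", analytic)  # (3.0, 2.0, 1.0)
print("numeric :", numeric)
assert all(abs(g1 - g2) < 1e-4 for g1, g2 in zip(analytic, numeric))
```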