Welcome to Tech Gallery, a centralized developer repository where I showcase a selection of my projects. Let's connect and innovate together!

Rizi2001/TechGallery

TechGallery

Repository Structure

Tech Gallery
│
├──Project Folder
│	├──Code/Notebook
│	├──... 
│
├──Project Folder 
│	├──Code/Notebook 
│	├──... 
│	├──... 
├──... 
│	├── ...

Project Categories

  • [BERT Text Classification Model Fine-Tuning for Q&A Chat Bot]

    • Fine-tuned a BERT model (distilbert-base-uncased) for text classification. The data consisted of 500 questions and their answers. Five variations of each question were generated and assigned to a class. The full dataset, more than 3000 questions, was divided into 650+ classes and used to fine-tune and validate the model to classify a question into a particular class; returning that class's answer completes the chat bot functionality.
  • [Bitcoin Prediction Using LDA Topic Modelling]

    • Bitcoin price prediction using topic modelling on news to account for the effect of news sentiment.
  • [Chat Bot TextEmbeddings-Gecko@001 Model]

    • Flask application endpoint for using the chat bot from the front end. It uses a CSV file containing the questions, answers, and their generated embeddings, which are compared against the user input for similarity. It also stores logs of user input in the static directory.
    • Chat bot built on Google's Gecko@001 model, which generates text embeddings that are compared using cosine similarity.
    • Chat Bot 2.0, which uses Google's Gecko@001 model to generate text embeddings and cosine similarity to suggest related questions when a question goes unanswered because of a low confidence score.
    • Evaluation mechanism for the chat bot that automatically generates test questions (variations of the original data) and reports the chat bot's responses, the ground truths, and the accuracy score.
    • Chat-Bot Embedding Data Updation notebook, which incrementally improves the chat bot's performance by adding to its embedding dataset the embeddings of questions that users asked but the chat bot didn't answer.
  • [Comparision of Whisper & Faster-Whisper STT]

    • Notebook comparing inference between OpenAI's Whisper Large-V2 model and SYSTRAN's Faster-Whisper Large-V2 model, both run locally.
  • [DSPy - Home Remedy Healthcare Assistant]

    • Used DSPy to build an AI-driven health assistant with zero-shot learning, RAG (Retrieval Augmented Generation), CoT (Chain of Thought) reasoning, and fine-tuning, enabling it to handle health-related questions and provide home remedies for simple problems.
  • [Faster_Whisper_STT_DJango_Service]

    • Django service/API that receives an audio file (MP3/WAV) under the key 'file' and transcribes it using SYSTRAN's Faster-Whisper. It stores the received audio in the /uploaded_files directory and returns the transcription as a string.
  • [Google Cloud Platform]

    • AI Doctor Vertex AI Chat-Bison Flask APP
      • A simple chat model integrating GCP's Vertex AI Chat-Bison/Text-Bison LLM API, combining a fixed guide prompt with user input to suggest home remedies for medical symptoms.
    • Enterprise Search Application
      • Built a Flask application using Google Cloud's GenApp Builder Enterprise Search and tested a search engine on custom data.
    • Healthcare Natural Language API Application
      • Using Google Cloud's Healthcare API, developed a Flask endpoint to generate medical codes (ICD-10, RxNorm, UMLS, etc.).
    • GCP Transcribe Streaming Audio & Infinite Transcription
      • Using Google Cloud's Speech API, infinite real-time speech-to-text transcription of streaming microphone audio using session management.
      • Using Google Cloud's Speech API, real-time speech-to-text transcription of streaming microphone audio for 305 seconds.
  • [Lang-Chain Vertex-AI Bigquery]

    • Using LangChain (text to SQL) on BigQuery data with Vertex AI's Text-Bison language model.
  • [Machine Learning Notebooks]

    • Small, diverse ML projects and algorithm implementations.
  • [Numpy, Pandas, Matplotlib]

    • Data visualization notebooks and guides.
  • [OpenAI Image generation API]

    • Interesting image generations from prompting the DALL·E model.
  • [Roman Urdu Natural Language Processing]

    • Poetry Generation, Noisy Channel Spell Correction, and Transliteration. All in either Urdu or Roman Urdu.
  • [University Projects C & C++]

    • Early university semester projects and assignments in C & C++.
  • [Web Dev PHP]

    • Basic Gym Management System using Oracle SQL Server and PHP
  • [Web Scraping]

    • Crawler for the Daraz.pk website that collects information about any given product.
    • Google Images crawler/scraper that collects Google Images URLs and downloads the images for viewing, using Selenium, BeautifulSoup (bs4), and SERP API (a crawling tool).
    • MorningStar Financials web crawler that generates daily market reports.
  • [Whisper Speech Model]

    • OPUS to MP3 file format converter
    • Whisper_Base_Model_Language_Detection Notebook using log-Mel spectrogram
    • Whisper model Speech to Text (Multiple models)
    • Sample Audio Mp3 file
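
As a rough illustration of the similarity lookup described for the Gecko chat bot projects above, the sketch below compares a query embedding against stored question embeddings using cosine similarity, answers when the best score clears a threshold, and otherwise suggests related questions. It is not the code from the repository: the record layout, the `answer_or_suggest` helper, the `threshold` and `top_k` values, and the toy 3-dimensional vectors are all assumptions made for illustration. Real embeddings would come from the embedding model and be loaded from the CSV file.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def answer_or_suggest(query_vec, records, threshold=0.8, top_k=2):
    """Answer when confident, otherwise suggest the closest known questions.

    `records` is a hypothetical stand-in for the CSV rows: a non-empty list
    of dicts with 'question', 'answer', and 'embedding' keys.
    """
    # Score every stored question and sort from most to least similar.
    scored = sorted(
        ((cosine_similarity(query_vec, r["embedding"]), r) for r in records),
        key=lambda pair: pair[0],
        reverse=True,
    )
    best_score, best = scored[0]
    if best_score >= threshold:
        return {"answer": best["answer"], "score": best_score, "suggestions": []}
    # Low confidence: return no answer, only related questions.
    return {
        "answer": None,
        "score": best_score,
        "suggestions": [r["question"] for _, r in scored[:top_k]],
    }


# Toy data (assumed); real vectors would be produced by the embedding model.
RECORDS = [
    {"question": "How do I reset my password?",
     "answer": "Use the reset link.", "embedding": [1.0, 0.0, 0.0]},
    {"question": "How do I change my email?",
     "answer": "Edit your profile.", "embedding": [0.0, 1.0, 0.0]},
    {"question": "How do I delete my account?",
     "answer": "Contact support.", "embedding": [0.0, 0.0, 1.0]},
]

confident = answer_or_suggest([0.9, 0.1, 0.0], RECORDS)
unsure = answer_or_suggest([0.6, 0.6, 0.5], RECORDS)
```

Here `confident` carries an answer because the query points almost entirely along the first stored vector, while `unsure` carries only suggestions because no stored question scores above the threshold. The threshold itself is a tuning choice; the value used here is arbitrary.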

Feel free to explore each category to discover more about my projects and my journey as a data scientist.