usmanmalik57 12 Junior Poster in Training

Large language models are trained on a fixed corpus, and their knowledge is often limited to the documents they are trained on. Techniques like retrieval augmented generation, continuous pre-training, and fine-tuning enhance an LLM's default knowledge. However, even these techniques cannot enable an LLM to answer queries that require searching the web.

AI agents use tools, which allow LLMs to access external knowledge sources such as databases, APIs, and the Internet. To simplify this process, OpenAI has introduced a Web Search API that allows you to search the Internet with natural language queries through large language models.

You can integrate the Web Search API in Python using the OpenAI Python library. However, most advanced agentic applications in Python are developed using an orchestration framework such as LangGraph. Therefore, it is important to understand the process of integrating the Web Search API in LangGraph.

This article will show you how to integrate the OpenAI Web Search Preview API in a LangGraph node. After reading this article, you will be able to develop LangGraph applications that search the Internet using the OpenAI Web Search Preview API.

So, let's begin without further ado.

Importing and Installing Required Libraries

The following script installs the libraries required to run the code in this article. We will also install the graphviz and libgraphviz libraries to draw the graph diagrams.

# !sudo apt-get update
# !sudo apt-get install -y graphviz libgraphviz-dev

!pip install -U -q \
langgraph langchain-openai langchain-core \
openai pydantic graphviz pygraphviz

The script below …
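
Before wiring the call into a graph, it helps to see the underlying request. Below is a minimal sketch of invoking the Web Search Preview tool directly through the OpenAI Responses API; the model name, the web_search_preview tool type, and the web_search helper are illustrative choices based on OpenAI's published examples, and the article wraps a call like this inside a LangGraph node.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def web_search(query: str) -> str:
    # Attaching the web_search_preview tool lets the model search the Internet
    response = client.responses.create(
        model="gpt-4o",
        tools=[{"type": "web_search_preview"}],
        input=query,
    )
    return response.output_text

print(web_search("What are the latest developments in LangGraph?"))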

usmanmalik57 12 Junior Poster in Training

On April 14, 2025, OpenAI released GPT-4.1 — a model touted as the new state-of-the-art, outperforming GPT-4o on all major benchmarks.
As always, I like to evaluate new LLMs on simple tasks like text classification and summarization to see how they compare with current leading models.

In this article, I will share the results I obtained for multi-class and multi-label text classification and text summarization using the OpenAI GPT-4.1 model. So, without further ado, let's begin.

Importing and Installing Required Libraries

The script below installs the Python libraries you need to run the code in this article.

!pip install openai
!pip install rouge-score
!pip install --upgrade openpyxl
!pip install pandas openpyxl

The following script imports the required libraries and modules into our Python application.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from itertools import combinations
from collections import Counter
from sklearn.metrics import hamming_loss, accuracy_score
from rouge_score import rouge_scorer
from openai import OpenAI

from google.colab import userdata
OPENAI_API_KEY = userdata.get('OPENAI_API_KEY')

Finally, we create the OpenAI client object we will use to call the OpenAI API. To access the API, you will need the OpenAI API key.

client = OpenAI(api_key = OPENAI_API_KEY)
Text Summarization with GPT-4.1

We will first summarize articles in the News Article Summary dataset.

The following script imports the dataset into your application and displays its first five rows.

# https://github.com/reddzzz/DataScience_FP/blob/main/dataset.xlsx


dataset = pd.read_excel(r"/content/summary_dataset.xlsx")
print(dataset.shape)
dataset.head()

Output:

img1.png

The content column contains the article content, whereas the human_summary column contains …

Pelorus_1 -56 Newbie Poster Banned

Through its combination of natural language processing and structured query generation, LangGraph makes interfacing with databases and extracting insights over SQL data easier than ever.

usmanmalik57 12 Junior Poster in Training

In a previous article, I presented a comparison of DeepSeek-R1-Distill-Llama-70b with DeepSeek-R1-Distill-Qwen-32B for text classification and summarization.

Both these models are distilled versions of the original DeepSeek R1 model. Recently, I wanted to try the original DeepSeek R1 model using the DeepSeek API. However, I was not able to test it because the DeepSeek API was not accepting requests due to high demand. My other option was to access the model via the Hugging Face API, but I could not run that either, since the model requires a very large amount of memory.

Finally, I found a solution via the FireworksAI API. Fireworks AI provides access to the DeepSeek R1 model hosted inside the United States, so you do not have to worry about sending your data to undesired locations.

In this article, you will see a comparison of the original DeepSeek R1 model with the Llama 3.1-405B model for text classification and summarization.

So, let's begin without further ado.

Importing and Installing Required Libraries

The following script installs the Fireworks Python library and the other libraries required to run scripts in this article.

!pip install --upgrade fireworks-ai
!pip install rouge-score
!pip install --upgrade openpyxl
!pip install pandas openpyxl

The script below imports the required libraries into your Python application.

You will also need the Fireworks API key to access the Fireworks API via the Python library.

from fireworks.client import Fireworks
import os
import pandas as pd
import time
from rouge_score import rouge_scorer …
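
Once the imports are in place, a single call to the Fireworks API looks like the sketch below. The client usage follows the fireworks-ai library's chat completions interface; the FIREWORKS_API_KEY environment variable and the model identifier are assumptions, so verify the latter against Fireworks' current model list.

client = Fireworks(api_key=os.environ["FIREWORKS_API_KEY"])

response = client.chat.completions.create(
    model="accounts/fireworks/models/deepseek-r1",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)
print(response.choices[0].message.content)
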
Pelorus_1 -56 Newbie Poster Banned

Great breakdown of DeepSeek R1 Distill LLaMA 70B! The explanation of text classification and summarization is clear and insightful. Appreciate the practical examples—makes implementation much easier. Thanks for sharing!

policenbicleara -8 Newbie Poster

Llama-70B struggles with sentiment analysis (69% accuracy) vs. Qwen-32B (87%).
Summarization performance is weaker, with lower ROUGE scores.
Qwen-32B is the better choice—smaller, faster, and more accurate.

rproffitt 2,701 https://5calls.org Moderator

"Wiz Research Uncovers Exposed DeepSeek Database Leaking Sensitive Information"
"Security researchers tested 50 well-known jailbreaks against DeepSeek’s popular new AI chatbot. It didn’t stop a single one."

It only seems to get worse the more you look at DeepSeek. And I must note how it is known to not want to talk about Tiananmen Square.

As such I think I will be downvoting DeepSeek R1 tutorials.

usmanmalik57 12 Junior Poster in Training

In the last article, I explained how you can use the DeepSeek-R1-Distill-Qwen-32B model for text classification and summarization problems.

In this article, we will use the DeepSeek-R1-Distill-Llama-70b for the same tasks.

The following results from DeepSeek-AI's official paper show that DeepSeek-R1-Distill-Llama-70b outperforms the other distilled models on 4 out of 6 benchmarks.

Output:

img1a.png

So, let's begin without further ado and see what results we achieve with the DeepSeek-R1-Distill-Llama-70b model.

Importing and Installing Required Libraries

We will access the DeepSeek-R1-Distill-Llama-70b model via the Groq API. To run scripts in this article, you will need the Groq API Key.

The following script installs the Python Groq library and the other Python modules you will need to run the scripts in this article.

!pip install groq
!pip install rouge-score
!pip install --upgrade openpyxl
!pip install pandas openpyxl

The script below imports the required libraries.

from groq import Groq
import os
import pandas as pd
from rouge_score import rouge_scorer
from sklearn.metrics import accuracy_score
from collections import defaultdict
from google.colab import userdata
Calling DeepSeek R1 Distill Llama 70B using the Groq API

Let's first see how to call the DeepSeek-R1-Distill-Llama-70b via the Groq API.

First, you will need to create an object of the Groq client and pass it your Groq API key.

Next, we define the generate_response() function, which accepts the system instructions and the user prompt.

Finally, you need to call the chat.completions.create() function of the Groq client object to generate a response via …
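
A minimal sketch of these three steps is shown below; the Groq client mirrors the OpenAI chat completions interface, and the model identifier is Groq's published name for this model at the time of writing, so treat both the identifier and the generate_response wording as illustrative.

client = Groq(api_key=userdata.get('GROQ_API_KEY'))

def generate_response(system_instructions, user_query):
    # Groq mirrors the OpenAI chat completions interface
    response = client.chat.completions.create(
        model="deepseek-r1-distill-llama-70b",
        messages=[
            {"role": "system", "content": system_instructions},
            {"role": "user", "content": user_query},
        ],
        temperature=0,
    )
    return response.choices[0].message.content

print(generate_response("You are a helpful assistant.", "What is 2 + 2?"))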

rproffitt commented: Leaky system. Unsafe to use. -4
RKE2 0 Newbie Poster

Fine-tuning OpenAI Vision Models for visual question-answering is an exciting step forward in AI! It is amazing how these models can combine image recognition with natural language processing to provide accurate, context-aware answers. Cannot wait to see more advancements!

usmanmalik57 12 Junior Poster in Training

DeepSeek-R1 is a groundbreaking family of reinforcement learning (RL)-driven AI models developed by the Chinese AI firm DeepSeek. It is designed to rival industry leaders like OpenAI and Google in complex decision-making and optimization problems.

In this article, we will benchmark the DeepSeek R1 model for text classification and summarization as we did with the Qwen and Llama models in a previous article.

So, let's begin without further ado.

Importing and Installing Required Libraries

We will use a distilled version of the DeepSeek R1 model via the Hugging Face Inference API. You can use larger versions of DeepSeek models via the DeepSeek API.

The following script installs the required libraries.


!pip install huggingface_hub==0.24.7
!pip install rouge-score
!pip install --upgrade openpyxl
!pip install pandas openpyxl

The script below imports the required libraries into your Python application.

from huggingface_hub import InferenceClient
import os
import pandas as pd
from rouge_score import rouge_scorer
from sklearn.metrics import accuracy_score
from collections import defaultdict
Calling DeepSeek R1 Model Using Hugging Face Inference API

Calling the DeepSeek R1 model via the Hugging Face Inference API is similar to calling any text generation model. You need to create an object of the InferenceClient class and pass it the model ID.

hf_token = os.environ.get('HF_TOKEN')

#deepseek-R1-distill endpoint
#https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
deepseek_model_client = InferenceClient(
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
    token=hf_token
)

Let's test the DeepSeek model we just imported. The following script defines the make_prediction() function, which accepts the model object, the system role, and the user query and returns the model response.
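
Here is a minimal sketch of such a function, assuming the chat_completion method that InferenceClient exposes in huggingface_hub 0.24:

def make_prediction(model, system_role, user_query):
    # chat_completion accepts OpenAI-style message dictionaries
    response = model.chat_completion(
        messages=[
            {"role": "system", "content": system_role},
            {"role": "user", "content": user_query},
        ],
        max_tokens=1000,
    )
    return response.choices[0].message.content

print(make_prediction(deepseek_model_client, "You are a helpful assistant.", "What is 2 + 2?"))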

usmanmalik57 12 Junior Poster in Training

Open-source LLMs are gaining significant traction due to their ability to match the performance of advanced proprietary LLMs. These models are free to use and allow users to modify their source code or fine-tune them on their own systems, making them highly versatile for various applications.

Alibaba's Qwen and Meta's Llama series are two prominent contenders in the open-source LLM landscape.

In this article, we compare the performance of Qwen 2.5-72b and Llama 3.3-70b, the latter of which was released on December 6, 2024. Meta claims that Llama 3.3-70b achieves the same performance as its Llama 3.1 model with 405 billion parameters, making it a powerful and efficient choice for NLP tasks.

By the end of this article, you will understand which model best suits specific NLP needs.

So, let's dive right in!

Installing and Importing Required Libraries

We need to install some libraries to call the Hugging Face inference API to access the Qwen and Llama models. We will also use the rouge-score library to calculate ROUGE scores for text summarization tasks. Below is the script to install the necessary libraries:

!pip install huggingface_hub==0.24.7
!pip install rouge-score
!pip install --upgrade openpyxl
!pip install pandas openpyxl

After installation, import the required libraries as shown below:

from huggingface_hub import InferenceClient
import os
import pandas as pd
from rouge_score import rouge_scorer
from sklearn.metrics import accuracy_score
from collections import defaultdict
Calling Qwen 2.5 and Llama 3.3 Models Using Hugging Face Inference API

To access models using the Hugging Face inference API, you …

sadiaafrin commented: good post +0
usmanmalik57 12 Junior Poster in Training

This tutorial demonstrates how to build an AI agent that queries SQLite databases using natural language. You will see how to leverage the LangGraph framework and the OpenAI GPT-4o model to retrieve natural language answers from an SQLite database, given a natural language query.

So, let's begin without further ado.

Importing and Installing Required Libraries

The following script installs the required libraries.


!pip install langchain-community
!pip install langchain-openai
!pip install langgraph

I ran the code in this article on Google Colab (https://colab.research.google.com/), so I did not have to install any other libraries.

The script below imports the required libraries into your Python application.

import sqlite3
import pandas as pd
import csv
import os

from langchain_community.utilities import SQLDatabase
from langchain_community.tools.sql_database.tool import QuerySQLDataBaseTool
from langchain_openai import ChatOpenAI
from langchain import hub
from langgraph.graph import START, StateGraph

from typing_extensions import Annotated
from IPython.display import Image, display

from google.colab import userdata
os.environ["OPENAI_API_KEY"] = userdata.get('OPENAI_API_KEY')
Creating a Dummy SQLite Database

We will perform question answering over an SQLite database using LangGraph. The process remains the same for other database types.

We will create a dummy SQLite database using the Titanic dataset CSV file from Kaggle.

The following script converts the Titanic-Dataset.csv file into the titanic.db SQLite database.

dataset = pd.read_csv('Titanic-Dataset.csv')
dataset.head()

Output:

img1.png


import pandas as pd
import sqlite3

def csv_to_sqlite(csv_file_path, database_name):

    try:
        # Read the CSV file into a pandas DataFrame with headers
        df = pd.read_csv(csv_file_path)

        # Connect to SQLite database (or create it if it doesn't exist)
        conn …
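
The excerpt cuts off inside the function; here is a minimal self-contained sketch of the same idea. The table name titanic is an assumption based on the target database titanic.db.

import sqlite3
import pandas as pd

def csv_to_sqlite(csv_file_path, database_name, table_name="titanic"):
    try:
        # Read the CSV file into a pandas DataFrame with headers
        df = pd.read_csv(csv_file_path)
        # Connect to the SQLite database (created if it doesn't exist)
        conn = sqlite3.connect(database_name)
        # Write the DataFrame to the table, replacing any existing one
        df.to_sql(table_name, conn, if_exists="replace", index=False)
        conn.close()
        print(f"Wrote {len(df)} rows to {database_name}:{table_name}")
    except Exception as e:
        print(f"Error: {e}")

csv_to_sqlite("Titanic-Dataset.csv", "titanic.db")
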
mansha99 0 Newbie Poster

Awesome

usmanmalik57 12 Junior Poster in Training

On November 20, 2024, OpenAI updated its GPT-4o model, claiming it is more creative and accurate on several benchmarks.

In this article, I compare the GPT-4o November update with the previous version (August update) for text summarization and classification tasks.

By the end of this article, you will see whether the new update outperforms the previous one, particularly for text classification and summarization tasks.

So, let's begin without further ado.

Importing and Installing Required Libraries

You must install the OpenAI Python library to access OpenAI models in Python. In addition, you need to install a few other libraries that will help you evaluate OpenAI models for text summarization and classification tasks.


!pip install openai
!pip install rouge-score
!pip install --upgrade openpyxl
!pip install pandas openpyxl

The following script imports the required libraries into our Python application.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from itertools import combinations
from collections import Counter
from sklearn.metrics import hamming_loss, accuracy_score
from rouge_score import rouge_scorer
from openai import OpenAI

from google.colab import userdata
OPENAI_API_KEY = userdata.get('OPENAI_API_KEY')

We will also define an OpenAI client object to call the OpenAI API.

client = OpenAI(api_key = OPENAI_API_KEY)
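
Since both versions share the gpt-4o family name, the comparison relies on dated snapshot identifiers. Below is a minimal sketch of a helper that targets a specific snapshot; the snapshot names follow OpenAI's published naming, so verify them against the current model list.

def get_response(prompt, model="gpt-4o-2024-11-20"):
    # November update snapshot; the August update is "gpt-4o-2024-08-06"
    response = client.chat.completions.create(
        model=model,
        temperature=0,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
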
Comparison for Text Summarization

Let's first see the results of text summarization. We will use the News Article Summary dataset you can download from Kaggle to summarize the articles.



# Kaggle dataset download link
# https://github.com/reddzzz/DataScience_FP/blob/main/dataset.xlsx


dataset = pd.read_excel(r"/content/dataset.xlsx")
print(dataset.shape)
dataset.head()

Output:

img1.png

The dataset contains news articles and human …

usmanmalik57 12 Junior Poster in Training

In my previous article, I presented a comparison of GPT-4o and Claude 3.5 Sonnet for multi-label text classification. The accuracies achieved by both models were relatively low.

Fine-tuning is one solution to overcome the low performance of large language models. With fine-tuning, you can incorporate custom domain knowledge into an LLM's weights, leading to better performance on your custom dataset.

This article will show how to fine-tune the OpenAI GPT-4o model on the multi-label research paper classification dataset. It is the same dataset I used for zero-shot multi-label classification in my previous article. You will see significantly better results with the fine-tuned GPT-4o model.

So, let's begin without further ado.

Importing and Installing Required Libraries

We will fine-tune the OpenAI GPT-4o model using the OpenAI API in Python. The following script installs the OpenAI Python library.


!pip install openai

The script below imports the required libraries into your Python application.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from itertools import combinations
from collections import Counter
from sklearn.metrics import hamming_loss, accuracy_score
import json
import os
from openai import OpenAI
Importing and Preprocessing the Dataset

We will fine-tune the GPT-4o model using the same multi-label research paper classification dataset we used in the last article.

The following script imports the dataset into a Pandas dataframe and displays the dataset header.

## dataset download link
## https://www.kaggle.com/datasets/shivanandmn/multilabel-classification-dataset?select=train.csv

dataset = pd.read_csv(r"D:\Datasets\Multilabel Research Paper Classification\train.csv")
print(f"Dataset Shape: {dataset.shape}")
dataset.head()

Output:

img1.png

The dataset has …
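
The preprocessing steps that follow in the article convert each row into OpenAI's chat-style fine-tuning format, with one JSON object per line. Here is a minimal sketch of building the training file; the TITLE, ABSTRACT, and label column names follow the Kaggle dataset's schema, so treat them as assumptions.

import json

label_columns = ["Computer Science", "Physics", "Mathematics",
                 "Statistics", "Quantitative Biology", "Quantitative Finance"]

def make_training_example(row):
    # Collect the categories whose label column is set to 1
    labels = [c for c in label_columns if row[c] == 1]
    return {
        "messages": [
            {"role": "system", "content": "Assign one or more subject categories to the research paper."},
            {"role": "user", "content": f"{row['TITLE']}\n\n{row['ABSTRACT']}"},
            {"role": "assistant", "content": ", ".join(labels)},
        ]
    }

with open("train.jsonl", "w") as f:
    for _, row in dataset.head(100).iterrows():
        f.write(json.dumps(make_training_example(row)) + "\n")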

usmanmalik57 12 Junior Poster in Training

In one of my previous articles, you saw a comparison of GPT-4o vs. Claude 3.5 Sonnet for zero-shot text classification. In that article, we performed multi-class text classification where input tweets belonged to one of three categories.

In this article, we will go a step further and perform zero-shot multi-label text classification with the GPT-4o and Claude 3.5 Sonnet models. We will compare the two models using accuracy and hamming loss criteria and see which model is better suited for zero-shot multi-label text classification.

So, let's begin without further ado.

Installing and Importing Required Libraries

We will call the Claude 3.5 Sonnet and GPT-4o models using the Anthropic and OpenAI Python libraries.

The following script installs these libraries.


!pip install anthropic
!pip install openai

The script below imports the required libraries into your Python application.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from itertools import combinations
from collections import Counter
from sklearn.metrics import hamming_loss, accuracy_score

import anthropic
from openai import OpenAI

from google.colab import userdata
OPENAI_API_KEY = userdata.get('OPENAI_API_KEY')
ANTHROPIC_API_KEY = userdata.get('ANTHROPIC_API_KEY')
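
With the keys in place, each model is reached through its own client. Below is a minimal sketch of a call to both; the model identifiers were the public ones at the time of writing, so adjust them to current versions.

openai_client = OpenAI(api_key=OPENAI_API_KEY)
anthropic_client = anthropic.Anthropic(api_key=ANTHROPIC_API_KEY)

gpt_response = openai_client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
claude_response = anthropic_client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=100,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(gpt_response.choices[0].message.content)
print(claude_response.content[0].text)
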
Importing and Visualizing the Dataset

We will use a multi-label text classification dataset from Kaggle containing research paper titles and abstracts that can belong to one or more of the six output categories: Computer Science, Physics, Mathematics, Statistics, Quantitative Biology, and Quantitative Finance.

The following script imports the training set from the dataset and displays the dataset header.


## dataset download link
## https://www.kaggle.com/datasets/shivanandmn/multilabel-classification-dataset?select=train.csv

dataset = pd.read_csv("/content/train.csv") …
usmanmalik57 12 Junior Poster in Training

On September 25, 2024, Meta released the Llama 3.2 series of multimodal models. The models are lightweight yet extremely powerful for image-to-text and text-to-text tasks.

In this article, you will learn how to use the Llama 3.2 Vision Instruct model for general image analysis, graph analysis, and facial sentiment prediction. You will see how to use the Hugging Face Inference API to call the Llama 3.2 Vision Instruct model.

The results are comparable with the proprietary Claude 3.5 Sonnet model as explained in this article.

So, let's begin without further ado.

Importing Required Libraries

We will call the Llama 3.2 Vision Instruct model using the Hugging Face Inference API. To access the API, you need to install the following library.

!pip install huggingface_hub==0.24.7

The following script imports the required libraries into your Python application.

import os
import base64
from IPython.display import display, HTML
from IPython.display import Image
from huggingface_hub import InferenceClient
import requests
from PIL import Image as PILImage  # aliased so it does not shadow IPython's Image display helper
from io import BytesIO
import matplotlib.pyplot as plt
A Basic Image Analysis Example with Llama 3.2 Vision Instruct Model

Let's first see how to analyze an image with the Llama 3.2 Vision Instruct model via the Hugging Face Inference API.

We will analyze the following image.

image_url = r"https://healthier.stanfordchildrens.org/wp-content/uploads/2021/04/Child-climbing-window-scaled.jpg"
Image(url=image_url, width=600, height=600)

Output:

image-1.png

To analyze an image using the Hugging Face Inference API, you must first create an object of the InferenceClient class from the huggingface_hub module. You must pass your Hugging Face access token to the client.
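
A minimal sketch of the client creation and the call itself is shown below. It assumes the chat_completion method of InferenceClient and the OpenAI-style message format it accepts; the model ID is the 11B vision variant, so adjust it as needed.

client = InferenceClient(
    "meta-llama/Llama-3.2-11B-Vision-Instruct",
    token=os.environ.get("HF_TOKEN"),
)

response = client.chat_completion(
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": image_url}},
            {"type": "text", "text": "Describe this image in one paragraph."},
        ],
    }],
    max_tokens=512,
)
print(response.choices[0].message.content)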

Lihui Zhang commented: No understand but impressive... I just learned LeNet-5 +0
usmanmalik57 12 Junior Poster in Training

This article explains how to create a retrieval augmented generation (RAG) chatbot in LangChain using open-source models from Hugging Face serverless inference API.

You will see how to call large language models (LLMs) and embedding models from Hugging Face serverless inference API using LangChain. You will also see how to employ these LLMs and embedding models to create LangChain chatbots with and without memory.

So, let's begin without further ado.

Installing and Importing Required Libraries

We will first install the Python libraries required to run the code in this article.


!pip install langchain
!pip install langchain_community
!pip install pypdf
!pip install faiss-gpu
!pip install langchain-huggingface
!pip install --upgrade --quiet huggingface_hub

The script below imports the required libraries into your Python application.

Since we will be accessing the Hugging Face serverless inference API, you must obtain your access token from Hugging Face.

Note: The code in this article is run with Google Colab, so I used the userdata.get() method to access environment variables. You must use the method that is appropriate for your environment.

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_huggingface import HuggingFaceEndpoint, ChatHuggingFace
from langchain_community.embeddings import (
    HuggingFaceInferenceAPIEmbeddings,
)
from langchain_community.vectorstores import FAISS
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains import create_retrieval_chain
from langchain_core.documents import Document
from langchain.chains import create_history_aware_retriever
from langchain_core.prompts import MessagesPlaceholder
from langchain_core.messages import HumanMessage, AIMessage
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain.memory import ChatMessageHistory

import os
from google.colab import userdata

hf_token = userdata.get('HF_API_TOKEN')
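
As a preview of how the chatbot's LLM is created, here is a minimal sketch using HuggingFaceEndpoint and ChatHuggingFace; the repo_id is an illustrative choice rather than necessarily the model used later in the article.

llm = HuggingFaceEndpoint(
    repo_id="mistralai/Mistral-7B-Instruct-v0.3",
    huggingfacehub_api_token=hf_token,
    max_new_tokens=512,
)
chat_model = ChatHuggingFace(llm=llm)
print(chat_model.invoke("What is retrieval augmented generation?").content)
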
Brandon_38 0 Newbie Poster

This post is a treasure trove of inspiration for third-year projects! The diverse range of ideas encourages creativity and exploration in various fields. Whether it’s tech, arts, or social issues, these suggestions can help students find their passion. It's exciting to see such innovative concepts that can lead to meaningful outcomes!

Brandon_38 0 Newbie Poster

The competition between Qwen and Llama in the open-source LLM race is fascinating to follow! Both models bring unique strengths to the table, making it exciting to see which will emerge as the leader. I appreciate the ongoing innovation in this space, as it pushes the boundaries of what's possible with language models. It will be interesting to watch how their developments shape the future of AI!

usmanmalik57 12 Junior Poster in Training

Open-source LLMs, owing to their comparable performance with advanced proprietary LLMs, have been gaining immense popularity lately. Open-source LLMs are free to use, and you can easily modify their source code or fine-tune them on your own systems.

Alibaba's Qwen and Meta's Llama series of models are two major players in the open source LLM arena. In this article, we will compare the performance of Qwen 2.5-72b and Llama 3.1-70b models for zero-shot text classification and summarization.

By the end of this article, you will have a rough idea of which model to use for your NLP tasks.

So, let's begin without further ado.

Installing and Importing Required Libraries

We will call the Hugging Face inference API to access the Qwen and Llama models. In addition, we will need the rouge-score library to calculate ROUGE scores for text summarization tasks. The script below installs the required libraries for this article.


!pip install huggingface_hub==0.24.7
!pip install rouge-score
!pip install --upgrade openpyxl
!pip install pandas openpyxl

The script below imports the required libraries.


from huggingface_hub import InferenceClient
import os
import pandas as pd
from rouge_score import rouge_scorer
from sklearn.metrics import accuracy_score
from collections import defaultdict
Calling Qwen 2.5 and Llama 3.1 Using Hugging Face Inference API

To access models via the Hugging Face inference API, you will need your Hugging Face User Access tokens.

Next, create a client object for the corresponding model using the InferenceClient class from the huggingface_hub library.
You …

usmanmalik57 12 Junior Poster in Training

In my previous article, I explained how to fine-tune OpenAI GPT-4o model for natural language processing tasks.

At OpenAI DevDay, held on October 1, 2024, OpenAI announced that users can now fine-tune OpenAI vision and multimodal models such as GPT-4o and GPT-4o mini. The best part is that fine-tuning vision models is free until October 31.

In this article, you will see an example of vision fine-tuning the GPT-4o model on your custom visual question-answering dataset. So, let's begin without further ado.

Importing and Installing Required Libraries

You will need to install the OpenAI Python library.

!pip install openai

In this article, we will be using the following Python libraries. Run the following script to import them into your Python application.


from openai import OpenAI
import pandas as pd
import json
import os
from sklearn.utils import shuffle
from sklearn.metrics import accuracy_score
Importing and Preprocessing the Dataset

We will fine-tune the GPT-4o model on a visual question-answering dataset you can download from Kaggle.

The following script imports the CSV file containing the question, the image ID, and the corresponding answer to the question.

#Data download link
#https://www.kaggle.com/datasets/bhavikardeshna/visual-question-answering-computer-vision-nlp

dataset = pd.read_csv(r"D:\Datasets\dataset\data_train.csv")
dataset.head()

Output:

img1.png

Here is the image with the id image100. You can see cups on the shelves.

image100.png

For vision fine-tuning, you must pass image URLs to the OpenAI API. Hence, we will upload our images to a cloud service (GitHub for this article). The dataset consists of over 1500 …
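
Once the images have public URLs, each training example pairs the question and image with the expected answer in OpenAI's chat fine-tuning format. Here is a minimal sketch of one JSONL entry; the base URL and answer text are placeholders.

import json

def make_vision_example(question, image_url, answer):
    return {
        "messages": [
            {"role": "user", "content": [
                {"type": "text", "text": question},
                {"type": "image_url", "image_url": {"url": image_url}},
            ]},
            {"role": "assistant", "content": answer},
        ]
    }

entry = make_vision_example(
    "What is on the shelves?",
    "https://raw.githubusercontent.com/<user>/<repo>/main/image100.png",
    "<answer from the dataset>",
)
print(json.dumps(entry))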

autowrecking 0 Newbie Poster

Impressive Thanks for sharing.

usmanmalik57 12 Junior Poster in Training

In one of my previous articles, I explained how to generate stunning images for free using diffusion models and showed how to use Stability AI's diffusion models for text-to-image generation.

Since then, the AI domain has progressed considerably, particularly in image generation. Black Forest Labs has released the Flux.1 series of state-of-the-art vision models.

In this article, you will see how to use Flux.1 models for text-to-image generation and image-to-image modification. You will import Flux models from Hugging Face and generate images using Python code.

So, let's begin without further ado.

Installing and Importing Required Libraries

Flux models are gated on Hugging Face, meaning you have to log into your account to access them. To do so from a Python application, particularly a Jupyter Notebook, you need to install the huggingface_hub module. In addition, you need to install the diffusers module from Hugging Face.

The script below installs these two modules.


!pip install huggingface_hub
!pip install git+https://github.com/huggingface/diffusers.git

Note: To run scripts in this article, you will need Nvidia GPUs. You can use Google Colab, which provides free Nvidia GPUs.

Next, let's import the required libraries into our Python application:


from huggingface_hub import notebook_login
import torch
import matplotlib.pyplot as plt
from diffusers import FluxPipeline
from diffusers import FluxImg2ImgPipeline
from diffusers.utils import load_image

notebook_login() # you need to log into your hugging face account using access token
Text to Image Generation with Flux

Flux models have two variants: timestep-distilled (FLUX.1-schnell) and guidance-distilled (FLUX.1-dev). The timestep-distilled …
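
Here is a minimal sketch of text-to-image generation with the timestep-distilled variant; the prompt is illustrative, and the parameters follow the model card's suggestions (schnell is distilled for very few steps and ignores classifier-free guidance).

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell",
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()  # reduces GPU memory usage

image = pipe(
    prompt="A watercolor painting of a lighthouse at sunrise",
    num_inference_steps=4,  # schnell works with very few steps
    guidance_scale=0.0,     # schnell ignores classifier-free guidance
).images[0]

plt.imshow(image)
plt.axis("off")
plt.show()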

rproffitt 2,701 https://5calls.org Moderator

How is this connected to "Computer Science?"

Also, in the US you would be conversant with The 505(b)(1 and 2) Regulatory Pathways. Since there are about 195 countries in the world, you have a lot of work ahead of your pharma company to navigate those other country regulations.

Kajal_14 0 Newbie Poster

What is the scope of a clinical trials course in the pharmaceutical industry?

usmanmalik57 12 Junior Poster in Training

On September 19, 2024, Alibaba released the Qwen 2.5 series of models. The Qwen 2.5-72B base and instruct models outperformed larger state-of-the-art models like Llama 3.1-405B on multiple benchmarks. It is safe to assume that Qwen 2.5-72B is a state-of-the-art open-source large language model.

This article will show you how to use Qwen 2.5 models in your Python applications. You will see how to import Qwen 2.5 models from the Hugging Face library and generate responses. You will also see how to perform text classification and summarization tasks on custom datasets using the Qwen 2.5-7B. So, let's begin without further ado.

Note: If you have a GPU with larger memory, you can also try Qwen 2.5-72B using the scripts provided in this article.

Installing and Importing Required Libraries

You can run the scripts in this article on Google Colab. In this case, you only need to install the following libraries.


!pip install rouge-score
!pip install --upgrade openpyxl
!pip install pandas openpyxl

The following script imports the libraries you need to run scripts in this article.

from transformers import AutoModelForCausalLM, AutoTokenizer
import pandas as pd
from sklearn.metrics import accuracy_score
from rouge_score import rouge_scorer
A Basic Example of Using Qwen 2.5 Instruct Model in Hugging Face

Before moving to text classification and summarization on custom datasets, let's first see how to generate a single response from the Qwen 2.5-7B model.

Importing the Model and Tokenizer from Hugging Face

The first step is to import the model weights and …
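
Here is a minimal sketch of the import and a single generation, following the standard transformers chat workflow; the generation parameters are illustrative.

model_name = "Qwen/Qwen2.5-7B-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128)
# Decode only the newly generated tokens
response = tokenizer.decode(
    output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=True
)
print(response)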

usmanmalik57 12 Junior Poster in Training

The AI wave has introduced a myriad of exciting applications. While text generation and natural language processing are leading the AI revolution, image- and vision-based technologies are quickly catching up. The intersection of text and vision applications has seen a rapid surge recently.

In this article, you'll learn how to generate videos using text and image inputs. We'll leverage open-source models from Hugging Face to bring these applications to life. So, without further ado, let's dive in!

Installing and Importing Required Libraries

We will use the Hugging Face diffusion models to generate videos from text and images. The following script installs the libraries you will need to import these models from Hugging Face.

!pip install --upgrade transformers accelerate diffusers imageio-ffmpeg

For text-to-video generation, we will use the CogVideoX-2b diffusion model. For image-to-video generation, we will use the Stability AI's img2vid model.

The following script imports the Hugging Face pipelines for the two models. We also import some utility classes to save videos and display images.


import torch
from diffusers import CogVideoXPipeline
from diffusers.utils import export_to_video

from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image
Text to Video Generation with Hugging Face Diffusers

The first step is to create a Hugging Face pipeline that can access the CogVideoX-2b model. You can also use the CogVideoX-5b model, but it requires more space and memory.

The following script creates the pipeline for the CogVideoX-2b model. We also call some utility methods such as enable_model_cpu_offload(), enable_sequential_cpu_offload(), …
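
A minimal sketch of the pipeline creation and a single generation is shown below; the prompt and frame settings are illustrative, and sequential CPU offloading is used here as the more memory-frugal of the offloading options the article mentions.

pipe = CogVideoXPipeline.from_pretrained(
    "THUDM/CogVideoX-2b", torch_dtype=torch.float16
)
pipe.enable_sequential_cpu_offload()  # keeps GPU memory usage low

video_frames = pipe(
    prompt="A panda playing a guitar in a bamboo forest",
    num_inference_steps=50,
    num_frames=49,
).frames[0]

export_to_video(video_frames, "output.mp4", fps=8)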

meyerrluanna 17 Newbie Poster Banned

The post gives a really clear, step-by-step guide on how to fine-tune the OpenAI Whisper model for audio classification. It's great how it covers everything from preparing the dataset to using tools for feature extraction and model training. This is definitely worth checking out. The explanations are practical, and the code snippets make it easy to follow along!

usmanmalik57 12 Junior Poster in Training

Large language models (LLMs) are trained to predict the next token (set of characters) following an input sequence of tokens. This makes LLMs well suited for generating unstructured textual responses.

However, we often need to extract structured information from unstructured text. With the Python LangChain module, you can extract structured information into a Python Pydantic object.

In this article, you will see how to extract structured information from news articles. You will extract the article's tone, type, country, title, and conclusion. You will also see how to extract structured information from single and multiple text documents.

So, let's begin without further ado.

Installing and Importing Required Libraries

As always, we will first install and import the required libraries.
The script below installs the LangChain and LangChain OpenAI libraries. We will extract structured data from the news articles using the latest OpenAI GPT-4 LLM.


!pip install -U langchain
!pip install -qU langchain-openai

Next, we will import the required libraries in a Python application.


import pandas as pd
import os
from typing import List, Optional
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_openai import OpenAI
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
Importing the Dataset

We will extract structured information from the articles in the News Article with Summary dataset.

The following script imports the data into a Pandas DataFrame.


dataset = pd.read_excel(r"D:\Datasets\dataset.xlsx")
dataset.head(10)

Output:

image1.png

Defining the Structured Output Format

To extract structured output, we need to define the attributes of …
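
The pattern the article builds toward looks like the sketch below: a Pydantic class lists the attributes, and with_structured_output binds it to the chat model. The field names here are illustrative, and the content column name follows the dataset shown above.

class ArticleInfo(BaseModel):
    title: str = Field(description="Title of the article")
    tone: str = Field(description="Overall tone, e.g. neutral, positive, negative")
    article_type: Optional[str] = Field(default=None, description="Type of article")
    country: Optional[str] = Field(default=None, description="Country the article covers")
    conclusion: str = Field(description="One-sentence conclusion")

llm = ChatOpenAI(model="gpt-4o", temperature=0)
structured_llm = llm.with_structured_output(ArticleInfo)
result = structured_llm.invoke(dataset["content"].iloc[0])
print(result)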

Duane_4 0 Newbie Poster

This is an interesting and useful post. I believe there is a typo in the 7th code block relating to train/val/test split. The second line should read:

val_df, test_df = train_test_split(temp_df, test_size=0.5, random_state=42)

The whole point of temp_df in the first line was to separate out 30% for val/test; the second line splits that to form 15% val 15% test.

usmanmalik57 12 Junior Poster in Training

Retrieval augmented generation (RAG) allows large language models (LLMs) to answer queries related to the data the models have not seen during training. In my previous article, I explained how to develop RAG systems using the Claude 3.5 Sonnet model.

However, RAG systems only answer queries about the data stored in the vector database. For example, you have a RAG system that answers queries related to financial documents in your database. If you ask it to search the internet for some information, it will not be able to do so.

This is where tools and agents come into play. Tools and agents enable LLMs to retrieve information from various external sources such as the internet, Wikipedia, YouTube, or virtually any Python method implemented as a tool in LangChain.

This article will show you how to enhance the functionalities of your RAG systems using tools and agents in the Python LangChain framework.

So, let's begin without further ado.

Installing and Importing Required Libraries

The following script installs the required libraries, including the Python LangChain framework and its associated modules and the OpenAI client.



!pip install -U langchain
!pip install langchain-core
!pip install langchainhub
!pip install -qU langchain-openai
!pip install pypdf
!pip install faiss-cpu
!pip install --upgrade --quiet  wikipedia

The script below imports the required libraries into your Python application.


from langchain.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper
from langchain.tools.retriever import create_retriever_tool
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_openai import ChatOpenAI, …
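
As a preview of the tools the agent will use, here is a minimal sketch of creating the Wikipedia tool from the wrappers imported above and querying it directly:

wikipedia_tool = WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper(top_k_results=1))
print(wikipedia_tool.run("Retrieval augmented generation"))
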
usmanmalik57 12 Junior Poster in Training

On August 20, 2024, OpenAI enabled GPT-4o fine-tuning in the OpenAI playground and the OpenAI API. The much-awaited feature is free for fine-tuning 1 million daily tokens until September 23, 2024.

In this article, I will show you how to fine-tune the OpenAI GPT-4o model for text classification and summarization tasks.

It is important to note that in my previous articles I have already demonstrated the results obtained for zero-shot text classification and zero-shot text summarization using the default GPT-4o model. In this article, you will see that fine-tuning a GPT-4o model improves text classification and text summarization performance significantly.

So, let's begin without further ado.

Installing and Importing Required Libraries

The following script installs the Python libraries you need to run the code in this article.


!pip install openai
!pip install rouge-score
!pip install --upgrade openpyxl
!pip install pandas openpyxl

The script below imports the required libraries into your Python application.


import os
import json
import time
import pandas as pd
from rouge_score import rouge_scorer
from sklearn.metrics import accuracy_score
from openai import OpenAI
Fine-tuning GPT-4o for Text Classification

In a previous article, I explained the process of fine-tuning GPT-4o mini and GPT-3.5 turbo models for zero-shot text classification.

The process remains the same for fine-tuning GPT-4o.
We will first import the text classification dataset, which in this article is the Twitter US Airline Sentiment Dataset.

The following script imports the dataset.


dataset = pd.read_csv(r"D:\Datasets\Tweets.csv")
dataset.head()

Output:

image1.png

Next, …
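
The steps that typically follow this excerpt are uploading the prepared JSONL file and starting the fine-tuning job; a minimal sketch is below, with the file name and base snapshot as illustrative choices, and assuming the client object created earlier.

training_file = client.files.create(
    file=open("train.jsonl", "rb"),
    purpose="fine-tune",
)
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-2024-08-06",
)
print(job.id, job.status)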

Dani 4,675 The Queen of DaniWeb Administrator Featured Poster Premium Member

Does Newegg still exist? It’s been decades since I’ve bought a computer or computer parts from anyone other than Apple or Dell.

rproffitt commented: Yes, Newegg is still around. That Lenovo laptop I use? From Amazon +17
rproffitt 2,701 https://5calls.org Moderator

For some it's price. Others it will be support and warranty.

And then because of tariffs and more, there would be about 193 countries so the list of such websites would be over SIX HUNDRED.
What country are you shopping in?

Here we can buy direct from Apple and Microsoft who offer some nice enough computers. Then we have Dell and others for server gear.

suwaidi 0 Newbie Poster

What is the best website for buying computers?

usmanmalik57 12 Junior Poster in Training

In a previous article, I compared GPT-4o mini vs. GPT-4o and GPT-3.5 Turbo for zero-shot text summarization. The results showed that GPT-4o mini achieves comparable performance at a much-reduced price compared to the other models.

I will compare Meta Llama 3.1 70b with the OpenAI GPT-4o snapshot for zero-shot text summarization in this article. The Meta Llama 3.1 series consists of Meta's state-of-the-art LLMs, including Llama 3.1 8b, Llama 3.1 70b, and Llama 3.1 405b. On the other hand, the OpenAI GPT-4o (https://platform.openai.com/docs/models) snapshot is OpenAI's latest LLM. We will use the Groq API to access Meta Llama 3.1 70b and the OpenAI API to access the GPT-4o snapshot model.

So, let's begin without further ado.

Installing and Importing Required Libraries

The following script installs the Python libraries you will need to run scripts in this article.


!pip install openai
!pip install groq
!pip install rouge-score
!pip install --upgrade openpyxl
!pip install pandas openpyxl

The script below imports the required libraries into your Python application.


import os
import time
import pandas as pd
from rouge_score import rouge_scorer
from openai import OpenAI
from groq import Groq
Importing the Dataset

This article will summarize the text in the News Articles with Summary dataset. The dataset consists of article content and human-generated summaries.

The following script imports the CSV dataset file into a Pandas DataFrame.


# Kaggle dataset download link
# https://github.com/reddzzz/DataScience_FP/blob/main/dataset.xlsx


dataset = pd.read_excel(r"D:\Datasets\dataset.xlsx")
dataset = dataset.sample(frac=1)
dataset['summary_length'] = dataset['human_summary'].apply(len)
average_length …
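
The excerpt cuts off while computing the average summary length; a minimal sketch of that step and of the ROUGE comparison used later is shown below (the two summary strings are placeholders).

average_length = dataset['summary_length'].mean()
print(f"Average summary length: {average_length:.0f} characters")

# ROUGE scores compare a model summary against the human reference
scorer = rouge_scorer.RougeScorer(['rouge1', 'rouge2', 'rougeL'], use_stemmer=True)
scores = scorer.score("<human summary>", "<model summary>")
print(scores['rouge1'].fmeasure)
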
usmanmalik57 12 Junior Poster in Training

In my previous articles, I presented a comparison of the OpenAI GPT-4o mini model with the GPT-4o and GPT-3.5 turbo models for zero-shot text classification. The results showed that GPT-4o mini, while significantly cheaper than its counterparts, achieves comparable performance.

On 8 August 2024, OpenAI enabled GPT-4o mini fine-tuning for developers across usage tiers 1-5. You can now fine-tune GPT-4o mini for free until 23 September 2024, with a daily token limit of 2 million.

In this article, I will show you how to fine-tune the GPT-4o mini for text classification tasks and compare it to the fine-tuned GPT-3.5 turbo.

So, let's begin without further ado.

Importing and Installing Required Libraries

The following script installs the OpenAI Python library you can use to make calls to the OpenAI API.


!pip install openai

The script below imports the required libraries into your Python application.


from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from openai import OpenAI
import pandas as pd
import os
import json
Importing the Dataset

We will use the Twitter US Airline Sentiment dataset for fine-tuning the GPT-4o mini and GPT-3.5 turbo models.

The following script imports the dataset and defines the preprocess_data() function. This function takes in a dataset and an index value as inputs. It then divides the dataset by sentiment category, returning 34, 33, and 33 tweets from each category, beginning at the specified index. This approach ensures we have around 100 balanced records. You can use a larger number of records for …
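
A minimal sketch of the described function follows; the airline_sentiment column name follows the Kaggle dataset's schema, so treat it as an assumption.

def preprocess_data(dataset, index):
    # Take 34 neutral, 33 positive, and 33 negative tweets starting at `index`
    neutral = dataset[dataset['airline_sentiment'] == 'neutral'].iloc[index:index + 34]
    positive = dataset[dataset['airline_sentiment'] == 'positive'].iloc[index:index + 33]
    negative = dataset[dataset['airline_sentiment'] == 'negative'].iloc[index:index + 33]
    # Combine and shuffle so the categories are interleaved
    return pd.concat([neutral, positive, negative]).sample(frac=1, random_state=42)

training_set = preprocess_data(dataset, 0)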

usmanmalik57 12 Junior Poster in Training

In my previous article on GPT-4o mini, I compared the performance of GPT-4o mini against GPT-3.5 Turbo and GPT-4o for zero-shot text classification. We saw that GPT-4o mini, being around 36 times cheaper, achieves only 2% less accuracy than GPT-4o. Furthermore, while being 1/3 of the price, the GPT-4o mini significantly outperformed the GPT-3.5 turbo model.

This article will compare GPT-4o mini, GPT-4o, and GPT-3.5 turbo for zero-shot text summarization. We will evaluate the models' text summarization capabilities using metrics such as ROUGE scores and LLM-based evaluation.

So, let's begin without further ado.

Importing and Installing Required Libraries

You must install the following Python libraries to run code in this article.


!pip install openai
!pip install rouge-score
!pip install --upgrade openpyxl
!pip install pandas openpyxl

The script below imports the required libraries.


import os
import time
import pandas as pd
from rouge_score import rouge_scorer
from openai import OpenAI
Importing the Dataset

We will summarize the articles in the News Articles with Summary dataset. The dataset consists of article content and human-generated summaries.

The following script imports the CSV dataset file into a Pandas DataFrame.



# Kaggle dataset download link
# https://github.com/reddzzz/DataScience_FP/blob/main/dataset.xlsx


dataset = pd.read_excel(r"D:\Datasets\dataset.xlsx")
dataset = dataset.sample(frac=1)
print(dataset.shape)
dataset.head()

Output:

image1.png

The content column contains the article content, while the human_summary column contains human-generated summaries of the article.

Next, we will find the average number of characters in all summaries. We will use this number to summarize articles using LLMs.


dataset['summary_length'] …
usmanmalik57 12 Junior Poster in Training

On July 18th, 2024, OpenAI released GPT-4o mini, their most cost-efficient small model. GPT-4o mini is around 60% cheaper than GPT-3.5 Turbo and around 97% cheaper than GPT-4o. As per OpenAI, GPT-4o mini outperforms GPT-3.5 Turbo on almost all benchmarks while being cheaper.

In this article, we will compare the cost, performance, and latency of GPT-4o mini with GPT-3.5 turbo and GPT-4o. We will perform a zero-shot tweet sentiment classification task to compare the models. By the end of this article, you will find out which of the three models is better for your use cases. So, let's begin without further ado.

Importing and Installing Required Libraries

As a first step, we will install and import the required libraries.

Run the following script to install the OpenAI library.


!pip install openai

The following script imports the required libraries into your application.


import os
import time
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from openai import OpenAI
Importing and Preprocessing the Dataset

To compare the models, we will perform zero-shot classification on the Twitter US Airline Sentiment dataset, which you can download from Kaggle.

The following script imports the dataset from a CSV file into a Pandas dataframe.


## Dataset download link
## https://www.kaggle.com/datasets/crowdflower/twitter-airline-sentiment?select=Tweets.csv

dataset = pd.read_csv(r"D:\Datasets\tweets.csv")
print(dataset.shape)
dataset.head()

Output:

image1.png

The dataset contains more than 14 thousand records. However, we will randomly select 100 records. Of these, 34, 33, and 33 will have neutral, positive, …
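
The core comparison is a zero-shot prompt sent to each model. A minimal sketch is below; the prompt wording, helper name, and model list are illustrative, and the client assumes OPENAI_API_KEY is set in the environment.

client = OpenAI()

def find_sentiment(tweet, model):
    # model is one of "gpt-4o-mini", "gpt-4o", or "gpt-3.5-turbo"
    prompt = ("Classify the sentiment of the following tweet as neutral, "
              f"positive, or negative. Return only the label.\nTweet: {tweet}")
    response = client.chat.completions.create(
        model=model,
        temperature=0,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content.strip().lower()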

Madisson -4 Newbie Poster

"If Moore's Law is in Danger, What's Ahead for Other Famous Laws?" explores the potential impact on other technological principles if Moore's Law, which predicts the doubling of transistors on a microchip approximately every two years, faces challenges. This discussion includes examining the future of laws like Metcalfe's Law (network value) and Kryder's Law (storage density), considering how advancements in quantum computing, AI, and alternative materials could redefine these principles in the rapidly evolving tech landscape.

Salem commented: 15 years too late - spamboy! -4
rproffitt 2,701 https://5calls.org Moderator

To VAIBHAV. Recently an instructor at some training site was posting similar questions. Turns out they were directing folk to their course on Udemy. They were found out with a little research by others.

Pelorus_1 -56 Newbie Poster Banned

Learning ethical hacking is a valuable skill. Start by considering online courses and certifications such as Certified Ethical Hacker (CEH) or Offensive Security Certified Professional (OSCP). There are also many resources and communities online where you can learn and practice ethical hacking techniques.

VAIBHAV_20 commented: "SKILL" +0
rproffitt 2,701 https://5calls.org Moderator

To VAIBHAV. I used your business card information to see if you could be found at SRM.

VAIBHAV_20 commented: But why are you trying to find me +0
Reverend Jim 5,242 Hi, I'm Jim, one of DaniWeb's moderators. Moderator Featured Poster

Why is it your job to determine it?

It isn't my job to determine it but I think it is a fair question.

The only thing that is against our community rules is to not specifically ask for help to do something illegal

I did not flag the post as violating any rules. I also did not down-vote the post. I wasn't trying to suggest that the intentions were illegal, and there is no way to determine the OP's actual intentions, but if someone were to ask me "where can I learn how to pick locks", I would ask the same question. Hacking skills are a dangerous tool set likely far more often used to bad ends rather than good.

rproffitt commented: Sometimes the only thing more dangerous than a question is an answer. +17
rproffitt 2,701 https://5calls.org Moderator

Are you attending "SRM Institute of Science and Technology"?

VAIBHAV_20 commented: How do you know +0
Dani 4,675 The Queen of DaniWeb Administrator Featured Poster Premium Member

rproffit, are you referring to this or this? Or this? (All 3 of which I was only able to find after typing in your exact Google query. "Vaibhav Jaiswal" came up with nothing.)

We are not psychics. We can't guess what search query you would use nor which search result you would click on. If you're going to make a recommendation to read something, can you please post a link to what that is?

rproffitt commented: I wonder if this is OP's own article? +0
VAIBHAV_20 commented: why do you think soo? that it's his or not +0
Dani 4,675 The Queen of DaniWeb Administrator Featured Poster Premium Member

Why is it your job to determine it?

The only thing that is against our community rules is to not specifically ask for help to do something illegal. He posed in his own question that he wants to learn ethical hacking, which is not illegal. Why should his very first post in this community be met with having to go out of his way to prove why he's not lying?

rproffitt commented: vaibhav jaiswal ethical hacking found it for me. +17
Reverend Jim 5,242 Hi, I'm Jim, one of DaniWeb's moderators. Moderator Featured Poster

How are we to determine whether you want to learn hacking for ethical, or for non-ethical use?

VAIBHAV_20 commented: Yeah you are right . I want to be unethical but one thing I can make sure that I won’t do be doing anything wrong that would hurt anyone +0
rproffitt 2,701 https://5calls.org Moderator

Have you read Vaibhav Jaiswal's article on this?

Dani commented: How are we supposed to know what article you're referring to? A google search turns up nothing -8
VAIBHAV_20 commented: No I’ll read it +0