we enable the AI to learn from provided texts, allowing it to automatically write code and execute tasks to control myCobot.
What Is an AI Agent?
An AI Agent is an intelligent entity capable of perceiving its environment, making decisions, and executing actions. Its functionality is built on a large language model (LLM). However, unlike directly conversing with an LLM, an AI Agent can think independently, use tools, and complete a given task step by step. Depending on how the developer designs it, it can accomplish a wide range of specialized tasks.
In this case study, we will build a simple single-agent system based on the DeepSeek large language model. The execution logic is straightforward and consists of the following steps: Definition + Observation + Thinking + Action + Memory.
Since an offline LLM lacks internet retrieval capabilities, it requires data input for learning. This ensures that whenever we activate the Agent, it is already prepared to assume its designated role and has sufficient knowledge to answer our queries. During usage, it will record user-approved responses, store them in a database, and continue learning from them.
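The loop below is a minimal, runnable sketch of that cycle. All helper names here are placeholders chosen for illustration, not part of any real library; in the actual project, the "thinking" step calls the DeepSeek API and the "action" step saves and runs the generated script.

# Minimal sketch of the Observation -> Thinking -> Action -> Memory cycle
# (all helper names are placeholders for illustration, not a real library)
memory = []  # user-approved responses, kept for later reuse

def think(observation):
    # In the real agent this calls the DeepSeek API; here it just echoes the task
    return f"# generated code for: {observation}"

def act(plan):
    # In the real agent this saves and executes the generated script
    print("executing:", plan)
    return "ok"

def run_agent(task, knowledge):
    observation = f"{knowledge}\nTask: {task}"   # Observation: task + knowledge base
    plan = think(observation)                    # Thinking: ask the LLM for code
    result = act(plan)                           # Action: run the code
    memory.append({"task": task, "plan": plan})  # Memory: store the approved result
    return result

run_agent("wave the arm", "myCobot reference text")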
myCobot 280 M5 Stack
The myCobot 280 series, created by Elephant Robotics, is a line of 6 DOF collaborative robot arms designed primarily for personal DIY projects, education, and research applications. The myCobot 280 used here is equipped with an M5Stack control board and offers a full Python API that is user-friendly and easy for beginners to learn and use.
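For instance, a few lines of pymycobot are enough to move the arm. This is a minimal sketch; the serial port "COM3" and baud rate are assumptions that depend on your own setup.

from pymycobot.mycobot import MyCobot

mc = MyCobot("COM3", 115200)             # serial port and baud rate depend on your setup
mc.send_angles([0, 0, 0, 0, 0, 0], 50)   # move all six joints to 0 degrees at speed 50
print(mc.get_angles())                   # read back the current joint angles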
Project Setup
1. Provide Knowledge Input
To enable the Agent to function effectively, we need to create a knowledge base and input relevant information.
We save the following information as separate DOCX files:
● Introduction to myCobot
● Technical details of 6 DOF collaborative robotic arms
● Usage instructions for the pymycobot API function library
(These resources can be found on myCobot’s GitBook.)
Save each reference text as a separate .docx document.
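If you prefer to build these documents programmatically, the same python-docx library used for reading them below can also write them. A minimal sketch follows; the file name and the paragraph contents are just examples.

from docx import Document

doc = Document()
doc.add_paragraph("myCobot 280 is a 6 DOF collaborative robot arm made by Elephant Robotics.")
doc.add_paragraph("pymycobot API: send_angles(angles, speed) moves the six joints to the given angles.")
doc.save(r"E:\MyCode\Agent_Deepseek\RobotData\mycobot_intro.docx")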
2. Load the DeepSeek Model
First, you need to obtain your own API key from the official DeepSeek website.
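Hard-coding the key is fine for a quick test, but it is safer to read it from an environment variable. The sketch below assumes a variable named DEEPSEEK_API_KEY, which is simply a name we choose here.

import os
from openai import OpenAI

# Read the key from an environment variable instead of embedding it in the source
client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com"
)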
Then, we need to load the knowledge into the DeepSeek model via code.
import os
from docx import Document
from openai import OpenAI

def extract_text_from_word(doc_path):
    """Extract the text from a Word (.docx) document."""
    doc = Document(doc_path)
    return "\n".join([para.text for para in doc.paragraphs])

def load_local_documents(directory):
    """Read every Word document in a directory."""
    texts = []
    for filename in os.listdir(directory):
        if filename.endswith(".docx"):
            file_path = os.path.join(directory, filename)
            text = extract_text_from_word(file_path)
            texts.append(text)
    return texts

# Use a raw string so the Windows backslashes are not treated as escape sequences
word_documents = load_local_documents(r"E:\MyCode\Agent_Deepseek\RobotData")
context = "\n".join(word_documents)  # Merge all text

client = OpenAI(
    api_key="xxxx {Your API}",
    base_url="https://api.deepseek.com"
)

query = ""  # the task you want the robot to perform

completion = client.chat.completions.create(
    model="deepseek-chat",
    temperature=0.6,
    messages=[
        {"role": "system", "content": "You are mainly researching Python tasks for collaborative robotic arms. You are familiar with and proficient in the Python language, and can utilize the 'pymycobot' robotic API interface to provide a complete Python code that can be used."},
        {"role": "user", "content": f"Word reference:\n{context}\n\nTask: {query}"}
    ]
)

print(completion.choices[0].message.content)
After executing the code, the DeepSeek model will generate a complete myCobot example script based on our input.
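For instance, asking the Agent to make the arm nod might produce something along these lines. This is an illustrative sketch of typical pymycobot code, not an actual model response, and the serial port is an assumption.

from pymycobot.mycobot import MyCobot
import time

mc = MyCobot("COM3", 115200)              # adjust the serial port to your setup
mc.send_angles([0, 0, 0, 0, 0, 0], 50)    # move to the home position
time.sleep(2)
for _ in range(3):                        # swing joint 4 back and forth three times
    mc.send_angles([0, 0, 0, 30, 0, 0], 50)
    time.sleep(1)
    mc.send_angles([0, 0, 0, -30, 0, 0], 50)
    time.sleep(1)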
3. Output Formatting
LLM-generated output is typically presented as a continuous text stream, which cannot be directly executed as a program.
To let the Agent achieve the result we want, automatically saving the AI-generated code as a .py file and executing it to control the robot, we must format the output properly.
We can use regular expressions to extract the Python code from the DeepSeek model's response and save it to a file.
import re

# Use a regular expression to match fenced Python code blocks
code_pattern = r"```python(.*?)```"  # Match ```python ... ``` blocks
matches = re.findall(code_pattern, message_content, re.DOTALL)

# If code blocks are found, extract them
if matches:
    python_code = "\n".join(matches).strip()
else:
    # If no markdown code block is found, fall back to the raw response
    python_code = message_content.strip()

# Specify the Python file path
file_path = "generated_script.py"

# Write the extracted Python code to a file
with open(file_path, "w", encoding="utf-8") as f:
    f.write(python_code)

print(f"Python code has been saved to {file_path}")
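To sanity-check the pattern, you can run it on a hand-written sample response; the sample string below is illustrative only.

import re

sample = "Here is the script:\n```python\nprint('hello myCobot')\n```\nDone."
print(re.findall(r"```python(.*?)```", sample, re.DOTALL))
# -> ["\nprint('hello myCobot')\n"]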
4. Execute the Script Automatically
To enable automatic execution, we need to call the system terminal and run the generated script.
import subprocess

def execute_command(command):
    """Execute a shell command and return stdout and stderr."""
    try:
        result = subprocess.run(command, shell=True, capture_output=True, text=True)
        return result.stdout, result.stderr
    except Exception as e:
        return None, str(e)

def run_command():
    """Run the specific command and print the output."""
    command = "conda activate base && python generated_script.py"
    print(f"\nRunning command: {command}")
    stdout, stderr = execute_command(command)
    if stdout:
        print(f"\nOutput:\n{stdout}")
    if stderr:
        print(f"\nError:\n{stderr}")

# Run the command
run_command()
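Note that conda activate may not work in a non-interactive shell. If the agent already runs inside the right environment, a simpler alternative is to launch the script with the current interpreter, as in this sketch.

import subprocess
import sys

# Run the generated script with the same Python interpreter as the agent itself
result = subprocess.run([sys.executable, "generated_script.py"],
                        capture_output=True, text=True)
print(result.stdout or result.stderr)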
5. Test with Robot
By adding a while True: loop to the execution process, we can continuously run tasks.
At this point, we have successfully built a simple AI Agent to control the myCobot robot. We can now connect the robot and test its functionality.
Code
import os
from docx import Document
from openai import OpenAI
import subprocess
import re

def extract_text_from_word(doc_path):
    """Extract the text from a Word (.docx) document."""
    doc = Document(doc_path)
    return "\n".join([para.text for para in doc.paragraphs])

def load_local_documents(directory):
    """Read every Word document in a directory."""
    texts = []
    for filename in os.listdir(directory):
        if filename.endswith(".docx"):
            file_path = os.path.join(directory, filename)
            text = extract_text_from_word(file_path)
            texts.append(text)
    return texts

def execute_command(command):
    """Execute a shell command and return stdout and stderr."""
    try:
        result = subprocess.run(command, shell=True, capture_output=True, text=True)
        return result.stdout, result.stderr
    except Exception as e:
        return None, str(e)

def run_command():
    """Run the specific command and print the output."""
    command = r"conda activate base && E: && cd E:\MyCode\Agent_Deepseek\RobotData && python generated_script.py"
    # print(f"\nRunning command: {command}")
    stdout, stderr = execute_command(command)
    if stdout:
        print(f"\nOutput:\n{stdout}")
    if stderr:
        print(f"\nError:\n{stderr}")

word_documents = load_local_documents(r"E:\MyCode\Agent_Deepseek\RobotData")
context = "\n".join(word_documents)  # Merge all text

client = OpenAI(
    api_key="xxxx {Your API}",
    base_url="https://api.deepseek.com"
)

while True:
    # Get user input
    query = input("\nEnter a command ('exit' to quit): ")
    if query.lower() == "exit":
        print("Exiting.")
        break

    completion = client.chat.completions.create(
        model="deepseek-chat",
        temperature=0.6,
        messages=[
            {"role": "system", "content": "You are mainly researching Python tasks for collaborative robotic arms. You are familiar with and proficient in the Python language, and can utilize the 'pymycobot' robotic API interface to provide a complete Python code that can be used."},
            {"role": "user", "content": f"Reference text:\n{context}\n\nTask: {query}. Generate a Python script."}
        ]
    )

    # Extract the generated Python code
    message_content = completion.choices[0].message.content
    code_pattern = r"```python(.*?)```"  # Extract code between ```python ... ```
    matches = re.findall(code_pattern, message_content, re.DOTALL)
    if matches:
        python_code = "\n".join(matches).strip()
    else:
        python_code = message_content.strip()

    # Save the extracted Python code to a file
    file_path = "E:\\MyCode\\Agent_Deepseek\\RobotData\\generated_script.py"
    with open(file_path, "w", encoding="utf-8") as f:
        f.write(python_code)

    print("Running...")

    # Run the generated script
    run_command()
Summary
By building a simple AI agent to control the 6-axis collaborative robot myCobot 280, we have learned how to create a basic LLM-driven robotics application.
Since this version of the agent does not include vision models, speech models, or additional sensors, it can only perform simple actions. However, if developers integrate vision, speech processing, and sensors, the AI Agent can autonomously complete much more complex tasks.