Upsonic/gpt-computer-assistant
Forks: 518 Stars: 5660 (updated 2024-12-23 01:46:14)
License: MIT
Language: Python
Dockerized Computer Use Agents with Production-Ready APIs - MCP Client for Langchain - GCA
Latest release: v0.22.3 (2024-08-15 02:28:44)
What is GCA?
GCA is an open-source framework for building vertical AI agents. It supports many LLMs and newer technologies such as MCP. You can build your own army of vertical AI agents in a few commands with the structured API.
Playground of GCA | NEW
At playground.gca.dev you can test and develop your own strategies for creating a vertical AI agent.
- Playground sessions are limited to 10 minutes.
GPT Computer Assistant (GCA)
GCA is an AI agent framework designed for computer use across Windows, macOS, and Ubuntu. GCA lets you hand repetitive, small-logic tasks over to an AI instead of a human worker. We believe this has significant potential. Whether you’re a developer, analyst, or IT professional, GCA can empower you to accomplish more in less time.
Imagine this:
- Extract the tech stacks of xxx Company | Sales Development Representative
- Identify relevant tables for analysis for xxx | Data Analytics
- Check the logs to find the root cause of this incident | Technical Support Engineer
- Configure Cloudflare security settings | Security Specialist
These examples show how GCA realizes the concept of vertical AI agents: solutions that not only replicate human tasks but, in some cases, also exceed human speed.
How Does GCA Work?
GCA is a Python-based project that runs on multiple operating systems, including Windows, macOS, and Ubuntu. It integrates external concepts, like the Model Context Protocol (MCP), along with its own modules, to interact with and control a computer efficiently. The system performs both routine and advanced tasks by mimicking human-like actions and applying computational precision.
1. Human-like Actions:
GCA can replicate common user actions, such as:
- Clicking: Interact with buttons or other UI elements.
- Reading: Recognize and interpret text on the screen.
- Scrolling: Navigate through documents or web pages.
- Typing: Enter text into forms or other input fields.
2. Advanced Capabilities:
Through MCP and GCA’s own modules, it achieves tasks that go beyond standard human interaction, such as:
- Updating dependencies of a project in seconds.
- Analyzing entire database tables to locate specific data almost instantly.
- Automating cloud security configurations with minimal input.
Prerequisites
- Python 3.10
Using GCA.dev Cloud
Installation
pip install gpt-computer-assistant
Single Instance:
from gpt_computer_assistant import cloud
# Starting instance
instance = cloud.instance()
# Show Screenshot
instance.current_screenshot()
# Asking and getting result
result = instance.request("Extract the tech stacks of gpt-computer-assistant Company", "i want a list")
print(result)
instance.close()
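If a request raises an exception, the `instance.close()` call in the snippet above is never reached. Here is a minimal sketch of one way to guarantee cleanup, using the standard library's `contextlib.closing`. `FakeInstance` is a hypothetical stand-in so the example runs without a GCA account, and `run_task` is an illustrative helper, not part of the GCA API.

```python
from contextlib import closing


class FakeInstance:
    """Hypothetical stand-in exposing the request/close methods used above."""

    def __init__(self):
        self.closed = False

    def request(self, task, expected_output):
        return f"[stub result] task={task!r}, expected={expected_output!r}"

    def close(self):
        self.closed = True


def run_task(instance, task, expected_output):
    """Send one request and always close the instance, even on error."""
    with closing(instance):
        return instance.request(task, expected_output)


fake = FakeInstance()
print(run_task(fake, "Extract the tech stacks of xxx Company", "i want a list"))
print("closed:", fake.closed)
```

With a real instance, `cloud.instance()` could be passed in place of `FakeInstance()`, assuming it exposes `request` and `close` as shown in the snippet above.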
Self-Hosted GCA Server
Docker
Pulling Image
- If you are using an ARM computer, such as an M-series MacBook, use ARM64 instead of AMD64 at the end of the image tag.
docker pull upsonic/gca_docker_ubuntu:dev0-AMD64
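The choice between the two image variants can be scripted. The sketch below maps the local CPU architecture to an image reference. The `dev0-ARM64` tag is an assumption inferred from the note above, so check it against the published tags before relying on it. `image_for` is an illustrative helper name.

```python
import platform

IMAGE = "upsonic/gca_docker_ubuntu"


def image_for(machine: str) -> str:
    """Return the image reference for a CPU architecture string."""
    # "arm64" (macOS) and "aarch64" (Linux) both denote 64-bit ARM.
    is_arm = machine.lower() in {"arm64", "aarch64"}
    # Assumption: the ARM tag is "dev0-ARM64", inferred from the note above.
    suffix = "ARM64" if is_arm else "AMD64"
    return f"{IMAGE}:dev0-{suffix}"


print("docker pull", image_for(platform.machine()))
```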
Starting container
docker run -d -p 5901:5901 -p 7541:7541 upsonic/gca_docker_ubuntu:dev0-AMD64
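The container may need a few seconds before the API port published above (7541) accepts connections. The following sketch polls the port with the standard `socket` module before any client code runs. `wait_for_port` is an illustrative helper, and an open TCP port only suggests, rather than proves, that the API inside the container is ready.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(0.5)


if __name__ == "__main__":
    # 7541 is the API port published by the docker run command above.
    print("API port reachable:", wait_for_port("localhost", 7541, timeout=2.0))
```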
LLM Settings & Usage
from gpt_computer_assistant import docker
# Starting instance
instance = docker.instance("http://localhost:7541/")
# Connecting to OpenAI and Anthropic
instance.client.save_models("gpt-4o")
instance.client.save_openai_api_key("sk-**")
instance.client.save_anthropic_api_key("sk-**")
# Asking and getting result
result = instance.request("Extract the tech stacks of gpt-computer-assistant Company", "i want a list")
print(result)
instance.close()
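The `"sk-**"` placeholders above stand for real API keys, which are usually better kept out of source files. Below is a minimal sketch that reads them from environment variables. The names `OPENAI_API_KEY` and `ANTHROPIC_API_KEY` are a common convention assumed here, not something GCA is known to require, and `read_api_keys` is an illustrative helper.

```python
import os


def read_api_keys(env=os.environ):
    """Collect provider API keys from environment variables, skipping unset ones."""
    # Assumption: these variable names are a common convention, not a GCA requirement.
    names = {"openai": "OPENAI_API_KEY", "anthropic": "ANTHROPIC_API_KEY"}
    return {provider: env[var] for provider, var in names.items() if env.get(var)}


keys = read_api_keys()
print("providers configured:", sorted(keys))

# With a running instance, the keys could then be passed to the calls shown above:
# if "openai" in keys:
#     instance.client.save_openai_api_key(keys["openai"])
# if "anthropic" in keys:
#     instance.client.save_anthropic_api_key(keys["anthropic"])
```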
Local
Installation
pip install 'gpt-computer-assistant[base]'
pip install 'gpt-computer-assistant[api]'
LLM Settings & Usage
from gpt_computer_assistant import local
# Starting instance
instance = local.instance()
# Connecting to OpenAI and Anthropic
instance.client.save_models("gpt-4o")
instance.client.save_openai_api_key("sk-**")
instance.client.save_anthropic_api_key("sk-**")
# Asking and getting result
result = instance.request("Extract the tech stacks of gpt-computer-assistant Company", "i want a list")
print(result)
instance.close()
Adding Custom MCP Server to GCA
instance.client.add_mcp_server("websearch", "npx", ["-y", "@mzxrai/mcp-webresearch"])
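When several MCP servers are needed, the call above can be driven from a small mapping of server names to commands. The sketch below assumes `add_mcp_server` takes a name, a command, and an argument list, as in the single example above. `RecordingClient` is a hypothetical stand-in that only records calls, and `register_mcp_servers` is an illustrative helper, not part of GCA.

```python
class RecordingClient:
    """Hypothetical stand-in that records add_mcp_server calls instead of starting servers."""

    def __init__(self):
        self.calls = []

    def add_mcp_server(self, name, command, args):
        self.calls.append((name, command, list(args)))


def register_mcp_servers(client, servers):
    """Call client.add_mcp_server once per entry of a name -> {command, args} mapping."""
    for name, spec in servers.items():
        client.add_mcp_server(name, spec["command"], spec.get("args", []))


servers = {
    "websearch": {"command": "npx", "args": ["-y", "@mzxrai/mcp-webresearch"]},
}
client = RecordingClient()
register_mcp_servers(client, servers)
print(client.calls)
```

With a real instance, `instance.client` would take the place of `RecordingClient()`.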
Roadmap
Feature | Status | Target Release |
---|---|---|
Clear Chat History | Completed | Q2 2024 |
Long Audios Support (Split 20mb) | Completed | Q2 2024 |
Text Inputs | Completed | Q2 2024 |
Just Text Mode (Mute Speech) | Completed | Q2 2024 |
Added profiles (Different Chats) | Completed | Q2 2024 |
More Feedback About Assistant Status | Completed | Q2 2024 |
Local Model Vision and Text (With Ollama, and vision models) | Completed | Q2 2024 |
Our Customizable Agent Infrastructure | Completed | Q2 2024 |
Supporting Groq Models | Completed | Q2 2024 |
Adding Custom Tools | Completed | Q2 2024 |
Click on something on the screen (text and icon) | Completed | Q2 2024 |
New UI | Completed | Q2 2024 |
Native Applications, exe, dmg | Completed | Q3 2024 |
Collaborative speaking with different voice models on long responses | Completed | Q2 2024 |
Auto Stop Recording when you finish talking | Completed | Q2 2024 |
Wakeup Word | Completed | Q2 2024 |
Continuous Conversations | Completed | Q2 2024 |
Adding more capability on device | Completed | Q2 2024 |
Local TTS | Completed | Q3 2024 |
Local STT | Completed | Q3 2024 |
Tray Menu | Completed | Q3 2024 |
New Line (Shift + Enter) | Completed | Q4 2024 |
Copy Pasting Text Compatibility | Completed | Q4 2024 |
Global Hotkey | On the way | Q3 2024 |
DeepFace Integration (Facial Recognition) | Planned | Q3 2024 |
Capabilities
We currently have many infrastructure elements in place. Our aim is to provide everything that is already available in the ChatGPT app.
Capability | Status |
---|---|
Local LLM with Vision (Ollama) | OK |
Local text-to-speech | OK |
Local speech-to-text | OK |
Screen Read | OK |
Click on a Text or Icon on the screen | OK |
Move to a Text or Icon on the screen | OK |
Typing Something | OK |
Pressing Any Key | OK |
Scrolling | OK |
Microphone | OK |
System Audio | OK |
Memory | OK |
Open and Close App | OK |
Open a URL | OK |
Clipboard | OK |
Search Engines | OK |
Writing and running Python | OK |
Writing and running SH | OK |
Using your Telegram Account | OK |
Knowledge Management | OK |
Add more tools | ? |
Predefined Agents
If you enable them, your assistant will work with these teams:
Team Name | Status |
---|---|
search_on_internet_and_report_team | OK |
generate_code_with_aim_team_ | OK |
Add your own | ? |
Recent releases (data updated 2024-09-09 00:04:13):
2024-08-15 02:28:44 v0.22.3
2024-08-15 02:12:31 v0.22.2
2024-08-11 05:16:54 v0.22.1
2024-08-10 18:32:14 v0.22.0
2024-07-24 04:59:42 v0.21.1
2024-07-24 01:28:21 v0.21.0
2024-07-14 21:01:07 v0.20.0
2024-06-21 06:04:20 v0.19.1
2024-06-21 03:18:25 v0.19.0
2024-06-19 02:43:34 v0.18.2
Topics:
assistant, chatgpt, chatgpt-app, claude, computer-use, gca, gpt, gpt-4o, langchain, linux, macos, mcp, mcp-client, mcp-server, model-context-protocol, openai, ubuntu, windows