Introduction
The pursuit of Artificial General Intelligence (AGI) continues to push the boundaries of what machines can do. However, as highlighted in the recent OSWORLD whitepaper, current AI systems still face significant challenges. The performance gap is stark: humans succeed in over 72% of tasks, compared to a mere 12% for the best AI models. Furthermore, AI agents struggle with complex task execution, often failing to integrate multiple applications and interfaces effectively. One workaround is curating AI agent clusters, in which each agent handles a smaller piece of the workload. Another, perhaps more practical, is simply to copy and paste from one tool to another yourself.
The Indispensable Human Element
Despite these limitations, AI tools are incredibly potent when paired with human oversight. We aren't at the phase where we can 100% rely on language models to do everything for us. We may never be. I often hear about how much time people save using AI, only to waste just as much time trying to get their result perfect. If AI can speed up your workflow even a little bit, take the gain and use your brain for the rest.
On June 4, 2024, ChatGPT, Claude and Perplexity all went down at the same time. Many expressed discomfort, as our new-new workflows were being disrupted. I felt this too… was the world ending? Fortunately, I know how to run llama3 locally, Gemini was working when I needed it, and I haven't forgotten how to do my job "the old-fashioned way".
Service was restored by mid-afternoon. The thing is, humans are part of the process, and that is a good thing! We are augmented, but not deprecated. Here are three ways I personally bridge the gap between AI capabilities and practical applications, so that these tools deliver their full potential:
Business Development
In my Fearless role, I leverage AI to augment and refine our business development strategies. I use AI-driven chatbots to efficiently organize my thoughts around the crucial elements of our proposals: summarizing key points, identifying stakeholders, and clarifying objectives. While AI sifts through massive datasets to provide initial insights, my expertise and judgment steer these tools to produce actionable and compliant frameworks. This personal integration of AI into my workflow ensures adherence to project requirements and enhances the compelling nature of our responses.
Coding and Debugging
AI significantly accelerates my software development workflow. Tools like ChatGPT assist in debugging and learning new programming concepts. While AI can suggest code optimizations and identify errors, I integrate these suggestions, understanding the broader implications of the project. This interaction exemplifies how AI facilitates rapid learning and problem-solving, yet my judgment finalizes the implementation.
Creative Processes
My artistic endeavors also benefit from AI, where I transform inspiration into visual art and media. The process starts with finding inspiration through social media for a creative prompt, followed by using MidJourney to create an image. I animate this image using RunwayML, refine the accompanying script with Grammarly AI, add a voice using ElevenLabs, and compile everything into a video with transcripts using Descript. In this process, I act as my own Zapier, integrating these diverse tools, each a "Lego piece," to craft a cohesive and impactful piece of content.
For instance, I used MidJourney to create a series of double-exposure portraits from a prompt I found. These images blend the silhouette of a designer with elements of a futuristic, ecologically-focused workspace, styled as vibrant digital art on a clean, white background. Specifying detailed attributes like these in the prompt helps MidJourney capture the envisioned aesthetic.
The process from inspiration to execution involves:
Searching for Inspiration: I browse social media hashtags like #PromptShare on platforms like Twitter to find creative prompts posted by other artists and creators.
AI-Powered Creation: I employ MidJourney to generate artwork using the selected prompt. This involves entering the prompt into MidJourney's platform, where I refine and iterate on the AI's outputs to align closely with my creative vision.
Finalizing Artwork: I review and refine the resulting images based on how well they match the initial inspiration and the current need.
Bringing It to Life: As mentioned above, this may be where I take the image to a tool like RunwayML to add motion. Maybe I use ElevenLabs to add a voice. The possibilities are endless.
By integrating these diverse tools, each treated as a "Lego piece," I craft cohesive and impactful pieces of content, showcasing how AI can speed up the creative process and enhance the depth and quality of artistic expression. Sometimes it's rapid concept art, other times it is good enough for full production. I decide, because I am the glue.
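For readers who think in code, the "Lego piece" workflow above can be sketched as a chain of stages with a human approval check after each one. This is only an illustration under stated assumptions: `Asset`, `make_stage`, and `run_pipeline` are hypothetical names, and the stages are placeholders that record their own name rather than calling MidJourney, RunwayML, ElevenLabs, or any other real service.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Asset:
    """A piece of content moving through the workflow (hypothetical type)."""
    prompt: str
    history: List[str] = field(default_factory=list)


Stage = Callable[[Asset], Asset]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage that stands in for a real tool.

    It only records its name in the asset's history; a real stage
    would hand the asset to an external tool instead.
    """
    def stage(asset: Asset) -> Asset:
        # Return a new Asset so a rejected result never alters the original.
        return Asset(asset.prompt, asset.history + [name])
    return stage


def run_pipeline(asset: Asset, stages: List[Stage],
                 approve: Callable[[Asset], bool]) -> Asset:
    """Run stages in order, keeping each result only if the human approves it."""
    for stage in stages:
        candidate = stage(asset)
        if not approve(candidate):
            break  # the human is the glue: stop at the last approved result
        asset = candidate
    return asset


if __name__ == "__main__":
    stages = [make_stage("generate_image"),
              make_stage("animate"),
              make_stage("add_voice")]
    final = run_pipeline(Asset("double-exposure portrait"), stages,
                         approve=lambda a: True)
    print(final.history)
```

The `approve` callback is where the human judgment sits: sometimes it stops after the first stage (rapid concept art), and sometimes it lets the whole chain run.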
Current Strengths of AI Agents
Interestingly, the OSWORLD article and whitepaper also highlight areas where AI agents excel. These include performing repetitive tasks with precision and processing structured data effectively. Such capabilities make AI invaluable for specific functions that demand consistency and accuracy, underscoring the utility of these systems in complementing human efforts.
Not all AI tool suggestions are acceptable. I often use Grammarly AI to correct grammar in my work, and its suggestions are not always what I need to tell the story my way. For example, while writing this article, I chose to ignore this suggestion:
Exploring Advanced AI Tools
For those looking to delve deeper into AI integration, platforms like Google's Vertex AI Agent Builder, Amazon Bedrock, and OpenAI's GPT offer robust environments for developing and deploying AI agents. These tools provide frameworks that accommodate extensive customization and fine-tuning, empowering users to mold AI agents according to specific needs and challenges.
Conclusion
The synthesis of human expertise with AI tools is more than just a partnership; it is a profound integration where human insight directs and enhances the function of AI. As the OSWORLD study outlines, the necessity for human oversight cannot be overstated. AI agents require our intervention to achieve high levels of effectiveness, especially in complex and nuanced environments. By continuing to guide AI with our expertise, we ensure that these tools perform optimally and act in ways that are relevant and beneficial to our objectives.
As we embrace these advanced tools, we must remain the guiding force, ensuring our messages are authentic and useful. Our workflows are accelerated today and will continue to be accelerated in the future, but for now, we are the glue that binds these tools together. Interested in more content like this? Check out "Curating Agent Clusters," and consider subscribing.