Mini Case Study and Reflection

Mini Case Study

Client Context

The client is a company with 200 employees that wants to use AI to generate ideas for how to run its operations. The company wants to use it to boost the dedication and commitment of employees and to improve communication. It aims to make employees more productive and increase their satisfaction. The leadership wants to select a tool that matches the goals of the company.

Approach

I began the experiment by writing a single prompt for all the AI tools to ensure they were judged on the same request. Giving the same instruction to four separate programs allowed a comparison of the different personalities and skills of the tools. The tools had to investigate why employees at the company had been feeling less motivated and less connected to their work over the last year. The AI tools were to act as consultants by first identifying the root of the problem, then proposing specific ways to fix it. Each tool was required to explain why a suggestion would work, what could go wrong, and how the company would know it was successful. The resulting text was carefully examined to see whether the company could use the advice.

Findings

Although all four tools were tested on the same prompt, none provided a complete answer that satisfied every requirement. The most impressive was Claude, which identified emotional problems such as a lack of safety and trust among workers. Even with this deep thinking, Claude did not provide clear solutions that could be applied immediately. Copilot presented neat, organized steps that looked ready for a board meeting. Beneath the polished look, however, it offered only safe answers that anyone could have guessed. ChatGPT was helpful overall but struggled to identify which tasks were most important to do first. Gemini remained shallow and failed to warn the company about risks or provide metrics for measuring success. All the tools fell back on the same familiar HR solutions, such as surveys, without explaining how these would fix the lack of motivation in the company. They also focused on leadership as the cause of the problem without considering other stressors. Human input was needed to make sure the solutions dealt with the causes of the crisis.

Recommendations

AI can help a company explore problems, but humans should always make the decisions. It is a tool for examining the issues that exist, not for providing final solutions. Even though AI tools present ideas instantly, a human must ascertain whether they are right for the situation. No single tool is perfect for everything, so the personality of each must be matched to the specific task. Claude can explore the deep reasons why employees are unhappy. Copilot can be used to present the information professionally. Whichever tool is used, the prompts must be strong.

Conclusion

AI saves time. The company can use it to get a first draft or a list of ideas quickly. Nonetheless, AI cannot make the final decisions. A human must look at its suggestions and decide whether they are useful and safe. The tools sort information and put it together in a neat structure in just a few seconds. Yet the company must combine the best parts of each tool's output and add human experience. By following this approach, the company can produce a final plan that works for both the organization and its employees.

Analytical Reflection

I performed a test by giving the exact same business problem to several AI programs. In this exercise, I noticed where each tool was helpful and where it gave poor advice. My objective was to measure the quality of answers to determine which tasks are safe to give an AI and which tasks still need a human to handle.

Prompt Design

My first mistake was writing a general prompt that did not give the AI enough detail. The tools needed precise boundaries to work with. Without them, the answers just described tasks that anyone could have guessed without using AI. I was not satisfied with these results. I then added a strict requirement for the AI to think harder, but all the tools still struggled to be truly original. They continued to suggest the same basic solutions. For example, the tools would suggest a survey without explaining how this method would fix an issue like burnout. The results gave a “What” but still struggled to explain the “Why.” Even with the improved instruction, the answers still had gaps that needed a human to fill.

Platform Differences

The four tools were given the same instructions, but their answers were different. Each processed the issue and offered solutions in its own way. Among them, Claude stood out because it looked past the surface and identified issues like employees lacking safety and trust. The other tools ignored these issues. Claude's weakness was its solutions, which were too vague. Copilot was the opposite in style and structure. Its approach produced a clear response that looked ready to be printed. Behind this professional mask, it stayed safe and did not offer any deep thoughts about why employees were unhappy. ChatGPT, by contrast, was helpful, but it was frustrating because it added extra details that were not important for the crisis this company was facing. Finally, the simple answers from Gemini made me feel that it had not truly grasped how complicated the problem was.

Results

I compared each analysis and suggestion and quickly realized that none was reliable on its own. Even the best responses viewed leadership problems as the main reason for low engagement. They did not consider excessive workload or pressure from outside. Some results looked complete at first, but when I kept reading, I realized they lacked detail. Others were hard to use. I realized that I had to examine every result for what is there, what is missing, and any misplaced detail hidden behind the confident wording.

Future Use

Honestly, this experiment changed how I see AI. I had hoped that at work I could simply get ready-made solutions. But now I see AI tools as aids I can use to come up with ideas. They are good for organizing thoughts, but they are not the final word. The biggest change for me was accepting that the real work stays with me. I have to check whether results make sense and adjust suggestions to fit the culture of my organization. I also have to make sure nothing important is left out. In a real organization, I would not blindly follow these results. They could easily lead to solutions that look good on paper but do not help employees become more engaged with their work.

Recommendation/Conclusion

AI’s best use is to save time when coming up with ideas. You should never expect it to finish the work for you, and you can never fully trust a single output. Every result should be treated as a draft that requires a human to look it over for mistakes or shallow logic. I suggest taking ideas from all the different AI tools and merging them into one. This way, a second prompt can fill in the holes that the first one missed. In my test, the AI was too careful, and it struggled to understand the details of the company. It could not fit the organization’s situation perfectly and stayed away from controversial truths. I had to step in and rewrite its suggestions to make them useful for the situation. It is always important to check for missing information in AI summaries to make sure that what they recommend can work for the specific organization.