Leveraging the Gemini 1.5 Multimodal Model (Generative AI) for Software Development

Monika Kumar Jethani
Google Developer Experts
4 min read · Apr 11, 2024


Image Source: https://dataedo.com/asset/img/banners/blog/cartoons.png

Google recently launched the Gemini 1.5 Pro model, a mid-sized multimodal model optimised for scaling across a wide range of tasks.
In this blog, we will learn how the Gemini 1.5 Pro model can help us during software development.

This blog is an improved, more recent version of my previous blog.

All examples in this blog use the freeform prompt in Google AI Studio with the Gemini 1.5 Pro model.

Below are some of the ways in which Gemini 1.5 Pro can help software developers:

1- Getting answers by having the model skim through the content
Suppose you want quick answers to specific questions without spending time reading a document or watching a video. In that case, you can upload those files to Google AI Studio’s freeform prompt and ask your questions.
If you have multiple questions, you can use a chat prompt in Google AI Studio instead.

The example below shows the model reading a PDF file to answer my query.

The example below shows the model viewing a video file to answer my query.

2- Getting accessibility descriptions to use in the application

To make our application usable by a diverse audience, as per the W3C, it is necessary to provide a content description for all the visual content in the application.
We can get one by uploading the image file to the model and asking for an accessibility description of that content. The description can then be used in our applications.
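The same request can also be prepared outside the AI Studio UI. Below is a minimal sketch, not an official client: it only builds a request body with the prompt and a base64-encoded image. The function name `build_alt_text_request` and the prompt wording are hypothetical, and the `contents` / `parts` / `inline_data` layout is my assumption of the Gemini API's `generateContent` request shape, so check it against the current API reference before relying on it.

```python
import base64
import mimetypes
from pathlib import Path

# Hypothetical prompt wording; tune it for your application.
ALT_TEXT_PROMPT = (
    "Write a concise accessibility description (alt text) for this image, "
    "suitable for a screen reader. Keep it to one or two sentences."
)


def build_alt_text_request(image_path: str, prompt: str = ALT_TEXT_PROMPT) -> dict:
    """Build a request body that carries the prompt and the image inline.

    The body layout is an assumption based on the Gemini API's
    generateContent format; nothing is sent over the network here.
    """
    path = Path(image_path)
    mime_type, _ = mimetypes.guess_type(path.name)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Not a recognised image file: {path.name}")

    # Inline binary data has to be base64-encoded to travel inside JSON.
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return {
        "contents": [
            {
                "parts": [
                    {"text": prompt},
                    {"inline_data": {"mime_type": mime_type, "data": encoded}},
                ]
            }
        ]
    }
```

The returned dictionary can be serialised with `json.dumps` and sent with any HTTP client together with your API key. The text that comes back is what you would place in the image's content description.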

3- Helps in understanding the code

I wanted to understand the code in this GitHub repository and how an offline-first app is built, so I downloaded it and chose the “Folder upload” option in Google AI Studio. I then asked the model my queries and received the results below.
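If you would rather paste a project in as text, a rough equivalent of the folder upload can be sketched in a few lines. This is an illustration under my own assumptions, not how AI Studio handles uploads internally: the function name `bundle_folder`, the extension list, and the size cap are all hypothetical and should be adjusted to your project.

```python
from pathlib import Path

# Hypothetical defaults: adjust the extensions and size cap for your project.
SOURCE_EXTENSIONS = {".py", ".js", ".ts", ".java", ".kt", ".md"}
MAX_FILE_BYTES = 200_000


def bundle_folder(root: str, question: str,
                  extensions: set[str] = SOURCE_EXTENSIONS) -> str:
    """Concatenate the source files under `root` into one prompt string."""
    root_path = Path(root)
    sections = []
    for path in sorted(root_path.rglob("*")):
        rel_parts = path.relative_to(root_path).parts
        # Skip hidden files and directories such as .git or .idea.
        if any(part.startswith(".") for part in rel_parts):
            continue
        if not path.is_file() or path.suffix not in extensions:
            continue
        if path.stat().st_size > MAX_FILE_BYTES:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        # Label each file with its relative path so the model can refer to it.
        sections.append(f"--- FILE: {'/'.join(rel_parts)} ---\n{text}")

    if not sections:
        raise ValueError(f"No matching source files found under {root}")
    return "\n\n".join(sections) + f"\n\nQuestion: {question}"
```

For example, `bundle_folder("path/to/repo", "How does this app work offline?")` returns a single string that can be pasted into the freeform prompt. Labelling each file with its relative path makes it easier for the model to point to specific files in its answer.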

4- Helps in understanding UML diagrams

If you want a UML diagram explained so that you can understand it better, just upload the image and type an input prompt, or start asking questions about the diagram, and you will receive the answers.

5- Helps in code review

If you want suggestions on a piece of code, just type it, or take a screenshot of it and upload the image to Google AI Studio, and ask the model for a code review. You will be surprised by the level of detail in the review, and the model also provides an improved version of the code snippet.

Conclusion

The new Gemini 1.5 Pro multimodal model can improve software developers’ productivity by assisting them with various software development tasks and helping them better understand code and software content.
They can create prompts for the above tasks, save them, and access them from the ‘My library’ tab in Google AI Studio: https://aistudio.google.com/app/library

To close, I would encourage software developers to start leveraging Google’s Generative AI suite across the tasks of the software development lifecycle.

References

https://developers.googleblog.com/2024/02/gemini-15-available-for-private-preview-in-google-ai-studio.html
