Introducing PaliGemma 2 mix: A vision-language model for multiple tasks

This past December, we launched PaliGemma 2, an upgraded vision-language model in the Gemma family. The release included pretrained checkpoints of different sizes (3B, 10B, and 28B parameters) that can be easily fine-tuned on a wide range of vision-language tasks and domains, such as image segmentation, short video captioning, scientific question answering and text-related tasks with high performance.

Now, we’re thrilled to announce the launch of PaliGemma 2 mix checkpoints. PaliGemma 2 mix models are tuned to a mixture of tasks, so you can explore the model's capabilities directly and use it out of the box for common use cases.

What’s new in PaliGemma 2 mix?

  • Multiple tasks with one model: PaliGemma 2 mix can solve tasks such as short and long captioning, optical character recognition (OCR), image question answering, object detection and segmentation.
  • Developer-friendly sizes: Use the best model for your needs thanks to the different model sizes (3B, 10B, and 28B parameters) and resolutions (224px and 448px).

If you were already using the original PaliGemma mix checkpoints, you can upgrade directly to PaliGemma 2 without making any changes. The model performs different tasks depending on how it’s prompted. You can review the prompt syntax for each task in the official documentation and learn more about how PaliGemma 2 was developed in our technical report.
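To make the "task depends on the prompt" idea concrete, here is a minimal sketch of a prompt builder. The helper `build_prompt` is hypothetical, and the task prefixes it uses (`caption`, `describe`, `ocr`, `answer`, `detect`, `segment`) reflect our reading of the documented prompt syntax, so check them against the official documentation before relying on them. Appending a trailing newline to every prompt is also an assumption, mirroring the "detect android\n" example in this post.

```python
def build_prompt(task, lang="en", text=None, objects=None):
    """Build a PaliGemma-style task prompt (hypothetical helper, not an official API)."""
    if task in ("caption", "describe"):
        # Short ("caption") or longer ("describe") captions in the given language.
        prompt = f"{task} {lang}"
    elif task == "ocr":
        prompt = "ocr"
    elif task == "answer":
        if not text:
            raise ValueError("'answer' needs a question in `text`")
        prompt = f"answer {lang} {text}"
    elif task == "detect":
        if not objects:
            raise ValueError("'detect' needs at least one object name")
        # Assumed separator between multiple objects: " ; "
        prompt = "detect " + " ; ".join(objects)
    elif task == "segment":
        if not objects or len(objects) != 1:
            raise ValueError("'segment' needs exactly one object name")
        prompt = f"segment {objects[0]}"
    else:
        raise ValueError(f"unknown task: {task!r}")
    # Assumption: every prompt ends with a newline.
    return prompt + "\n"


print(repr(build_prompt("detect", objects=["android"])))
print(repr(build_prompt("answer", text="What does the sign say?")))
```

Keeping prompt construction in one place makes it easier to switch tasks on the same image without touching the inference code.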


Detection

  • Task: Detection (PaliGemma-2-3b-mix-224)
  • Input: “detect android\n”
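For a detection prompt like the one above, the model answers in text rather than with a box drawn on the image. The sketch below shows one way to turn that text into pixel coordinates. It rests on assumptions not stated in this post: that each box is encoded as four `<locNNNN>` tokens in the order y_min, x_min, y_max, x_max, that values are normalized to 1024 bins, and that multiple detections are separated by " ; ". The function `parse_detections` and the sample string are hypothetical.

```python
import re

# Four consecutive <locNNNN> tokens followed by a label.
# Assumed token order: y_min, x_min, y_max, x_max, each in [0, 1023].
_BOX_PATTERN = re.compile(r"((?:<loc\d{4}>){4})\s*([^;<]+)")


def parse_detections(output, width, height):
    """Convert detection text into pixel boxes (x_min, y_min, x_max, y_max)."""
    detections = []
    for loc_tokens, label in _BOX_PATTERN.findall(output):
        y0, x0, y1, x1 = (int(v) / 1024 for v in re.findall(r"<loc(\d{4})>", loc_tokens))
        detections.append({
            "label": label.strip(),
            "box": (x0 * width, y0 * height, x1 * width, y1 * height),
        })
    return detections


# Hypothetical model output for illustration only.
sample = "<loc0256><loc0128><loc0768><loc0512> android"
print(parse_detections(sample, width=640, height=480))
# [{'label': 'android', 'box': (80.0, 120.0, 320.0, 360.0)}]
```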

Result: a cow standing on a beach next to a sign that says warning dangerous rip current.

Optical Character Recognition (OCR)

Result: A cow standing on a beach next to a warning sign.

Result:

WARNING DANGEROUS

RIP CURRENT


Get Started Today

Ready to discover the potential of PaliGemma 2? Here is how you can explore the mix model capabilities:

  • Try out the mix model with just a few clicks: Explore the mix model capabilities directly on the Hugging Face demo.
  • Learn how to run the model: Try out the Keras inference notebook directly in Google Colab or locally.
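If you prefer Hugging Face transformers to the Keras notebook, local inference might look roughly like the sketch below. Treat it as a starting point under stated assumptions, not the official recipe: the Hub ID `google/paligemma2-3b-mix-224`, the wrapper `run_paligemma`, and the image file name are our assumptions, and downloading the checkpoint may require accepting the model license on the Hub first.

```python
# Assumed Hub ID for the 3B-parameter, 224px mix checkpoint.
DEFAULT_MODEL_ID = "google/paligemma2-3b-mix-224"


def run_paligemma(image_path, prompt, model_id=DEFAULT_MODEL_ID, max_new_tokens=64):
    """Run one prompt against one image and return the generated text."""
    # Heavy imports live inside the function so the module loads without them.
    import torch
    from PIL import Image
    from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

    processor = AutoProcessor.from_pretrained(model_id)
    model = PaliGemmaForConditionalGeneration.from_pretrained(
        model_id, torch_dtype=torch.bfloat16
    ).eval()

    image = Image.open(image_path).convert("RGB")
    # Cast floating-point inputs (pixel values) to the model dtype.
    inputs = processor(text=prompt, images=image, return_tensors="pt").to(torch.bfloat16)
    prompt_len = inputs["input_ids"].shape[-1]

    with torch.inference_mode():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)

    # Drop the prompt tokens and decode only the newly generated ones.
    return processor.decode(output_ids[0][prompt_len:], skip_special_tokens=True)


# Example call (hypothetical image file):
# print(run_paligemma("beach.jpg", "ocr\n"))
```

Swapping the prompt string is all it takes to switch between captioning, OCR, question answering, and detection on the same image.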

While PaliGemma 2 mix performs strongly across multiple tasks, you will get the best results by fine-tuning PaliGemma 2 on your own task or domain. To learn how to do it, dive into our comprehensive documentation, check our official example notebooks for Keras and JAX, or use the Hugging Face transformers example. We’re looking forward to seeing what you build with it!
