Microsoft Phi-4 Multimodal: An AI that understands speech, images, and text

Last updated: 27/02/2025

  • Microsoft has launched Phi-4-multimodal, an AI model that processes speech, images, and text simultaneously.
  • With 5.6 billion parameters, it outperforms larger models in speech and vision understanding.
  • It is accompanied by Phi-4-mini, a version focused exclusively on text processing tasks.
  • Available on Azure AI Foundry, Hugging Face, and NVIDIA, with a wide range of applications in business and education.
Microsoft has taken a step forward in the world of language models with Phi-4-multimodal, its newest and most advanced AI model, capable of processing text, images, and speech simultaneously. Together with Phi-4-mini, it represents an evolution in the capabilities of small language models (SLMs), offering efficiency and accuracy without requiring a huge number of parameters.

The arrival of Phi-4-multimodal is not only a technical advance for Microsoft; it also competes directly with larger models such as those from Google and Anthropic. Its optimized architecture and improved reasoning capabilities make it an attractive option for many applications, from machine translation to image and speech recognition.

What is Phi-4-multimodal and how does it work?

Phi-4-multimodal is an AI model developed by Microsoft that can process text, images, and speech simultaneously. Unlike traditional models that work with a single modality, it integrates different kinds of input into a single representation space, thanks to the use of cross-modal learning techniques.

The model is built on a 5.6-billion-parameter architecture and uses a technique known as LoRAs (Low-Rank Adaptation adapters) to integrate the different types of data. This allows for greater accuracy in language processing and deeper interpretation of context.
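
To make the idea of a low-rank adapter more concrete, here is a minimal pure-Python sketch of a generic LoRA layer. It is an illustration of the general technique only, not Microsoft's implementation: the class name `LoRALinear`, the toy dimensions, the rank, and the `alpha` scaling value are all assumptions chosen for the example. The frozen weight matrix W stays untouched, and only the two small matrices A and B would be trained.

```python
import random


def matvec(M, x):
    """Multiply matrix M (list of rows) by vector x."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]


class LoRALinear:
    """Hypothetical toy layer: y = W x + (alpha / rank) * B (A x)."""

    def __init__(self, W, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(W), len(W[0])
        self.W = W  # frozen base weights, never updated
        # A starts with small random values, B starts at zero, so the
        # adapter contributes nothing until training changes B.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matvec(self.W, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_params(self):
        # Only A (rank x d_in) and B (d_out x rank) would be trained.
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


# Toy example: a 4 x 6 base matrix with a rank-2 adapter.
W = [[float(i + j) for j in range(6)] for i in range(4)]
x = [1.0] * 6
layer = LoRALinear(W, rank=2)

full_params = len(W) * len(W[0])
print("full matrix parameters:", full_params)
print("adapter parameters:", layer.trainable_params())
```

At this toy size the saving is small, but the adapter grows with rank × (d_in + d_out) while the full matrix grows with d_in × d_out, so the gap widens quickly for the large layers found in real models. A separate adapter can in principle be attached for each input type while the base weights are shared.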

Key capabilities and benefits

Phi-4-multimodal is particularly effective at key tasks that demand a high level of artificial intelligence:

  • Speech recognition: It outperforms specialized models such as WhisperV3 in transcription and machine translation tests.
  • Image processing: It can interpret documents and charts and perform OCR with high accuracy.
  • Low latency: This lets it run on mobile and low-power devices without sacrificing performance.
  • Seamless integration across modalities: Understanding text, speech, and images together improves its contextual reasoning.
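
As a rough back-of-the-envelope illustration of why a 5.6-billion-parameter model is a plausible fit for smaller devices, the sketch below estimates the memory needed for the weights alone at several numeric precisions. The precisions listed are assumptions for the sake of the arithmetic, not formats the article says Microsoft ships, and the estimate ignores activations, caches, and runtime overhead.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Parameter count as reported in the article.
PHI4_MULTIMODAL_PARAMS = 5.6e9

# Hypothetical storage precisions, chosen only for illustration.
for bits in (32, 16, 8, 4):
    gb = weight_memory_gb(PHI4_MULTIMODAL_PARAMS, bits)
    print(f"{bits:>2}-bit weights: ~{gb:.1f} GB")
```

Under these assumptions the weights take about 11.2 GB at 16 bits per parameter and about 2.8 GB at 4 bits, which is the kind of range where on-device use starts to look feasible.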

Comparison with other models

In terms of performance, Phi-4-multimodal has proven to be on par with larger models. Compared with Gemini-2.0-Flash-lite and Claude-3.5-Sonnet, it achieves similar results on multimodal tasks while remaining efficient thanks to its compact design.

However, it has some limitations in speech-based question answering, where models such as GPT-4o and Gemini-2.0-Flash have the advantage. This is due to its smaller size, which limits how much factual knowledge it can retain. Microsoft has indicated that it is working to improve this capability in future versions.

Phi-4-mini: the little brother of Phi-4-multimodal

Alongside Phi-4-multimodal, Microsoft also launched Phi-4-mini, a variant optimized for text-based tasks. The model is designed to deliver high efficiency in natural language processing, making it well suited to chatbots, virtual assistants, and other applications that require accurate text understanding and generation.

Availability and applications

Microsoft has made Phi-4-multimodal and Phi-4-mini available to developers through Azure AI Foundry, Hugging Face, and the NVIDIA API Catalog. Any company or user with access to these platforms can start experimenting with the models and apply them in different scenarios.

Given its multimodal approach, Phi-4 is aimed at areas such as:

  • Machine translation and real-time subtitling.
  • Document recognition and analysis for business.
  • Mobile applications with intelligent assistants.
  • Educational models to enhance AI-based teaching.

Microsoft has taken an interesting turn with these models by focusing on efficiency and scalability. With growing competition in the field of small language models (SLMs), Phi-4-multimodal is presented as a viable alternative to larger models, offering a balance between performance and processing capacity that remains accessible even on less powerful devices.