Generative artificial intelligence



Example prompts for the generated media on this page: "a photograph of an astronaut riding a horse"; "bossa nova with electric guitar"; "Borneo wildlife on the Kinabatangan River".
Generative artificial intelligence (generative AI, GenAI,[1] or GAI) is a subset of artificial intelligence that uses large generative models to produce text, images, videos, or other forms of data.[2][3][4] These models learn the underlying patterns and structures of their training data and use them to produce new data[5][6] in response to input, which often takes the form of a text prompt.[7][8]
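As a rough illustration of learning patterns from training data and then producing new data from an input, the following Python sketch builds a character-level bigram model. It is a toy chosen for brevity and is far simpler than the large neural models this article describes; the function names and the training sentence are invented for the example.

```python
import random
from collections import defaultdict


def train_bigram_model(text):
    """Record, for every character, which characters followed it in the text."""
    followers = defaultdict(list)
    for current_char, next_char in zip(text, text[1:]):
        followers[current_char].append(next_char)
    return followers


def generate(followers, prompt, length, seed=0):
    """Extend a non-empty prompt by sampling one observed follower at a time."""
    rng = random.Random(seed)
    output = list(prompt)
    for _ in range(length):
        candidates = followers.get(output[-1])
        if not candidates:  # the last character never had a follower in training
            break
        output.append(rng.choice(candidates))
    return "".join(output)


# Hypothetical training data: a few short sentences.
training_text = "the cat sat on the mat. the cat ate the rat. "
model = train_bigram_model(training_text)
sample = generate(model, prompt="th", length=30)
print(sample)
```

Every pair of adjacent characters in the output was seen in the training text, yet the output as a whole need not appear there. In that limited sense the model produces new data from learned patterns.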
Improvements in transformer-based deep neural networks, particularly large language models (LLMs), enabled an AI boom of generative AI systems in the early 2020s. These include chatbots such as ChatGPT, Copilot, Gemini, and LLaMA; text-to-image systems such as Stable Diffusion, Midjourney, and DALL-E; and text-to-video generators such as Sora.[9][10][11][12] Companies such as OpenAI, Anthropic, Microsoft, Google, and Baidu, as well as numerous smaller firms, have developed generative AI models.[7][13][14]
Generative AI has applications across a wide range of industries, including software development, healthcare, finance, entertainment, customer service,[15] sales and marketing,[16] art, writing,[17] fashion,[18] and product design.[19] However, concerns have been raised about the potential misuse of generative AI, such as cybercrime, the use of fake news or deepfakes to deceive or manipulate people, and the mass replacement of human jobs.[20][21] There are also intellectual property concerns about generative models that are trained on and emulate copyrighted works of art.[22]
References
- Newsom, Gavin; Weber, Shirley N. (5 September 2023). "Executive Order N-12-23" (PDF). Executive Department, State of California. Archived (PDF) from the original on 21 February 2024. Retrieved 7 September 2023.
- Pinaya, Walter H. L.; Graham, Mark S.; Kerfoot, Eric; Tudosiu, Petru-Daniel; Dafflon, Jessica; Fernandez, Virginia; Sanchez, Pedro; Wolleb, Julia; da Costa, Pedro F.; Patel, Ashay (2023). "Generative AI for Medical Imaging: extending the MONAI Framework". arXiv:2307.15208 [eess.IV].
- "What is ChatGPT, DALL-E, and generative AI?". McKinsey. Retrieved 2024-12-14.
- "What is generative AI?". IBM. 22 March 2024.
- Pasick, Adam (2023-03-27). "Artificial Intelligence Glossary: Neural Networks and Other Terms Explained". The New York Times. ISSN 0362-4331. Archived from the original on 1 September 2023. Retrieved 2023-04-22.
- Karpathy, Andrej; Abbeel, Pieter; Brockman, Greg; Chen, Peter; Cheung, Vicki; Duan, Yan; Goodfellow, Ian; Kingma, Durk; Ho, Jonathan; Houthooft, Rein; Salimans, Tim; Schulman, John; Sutskever, Ilya; Zaremba, Wojciech (2016-06-16). "Generative models". OpenAI. Archived from the original on 17 November 2023. Retrieved 15 March 2023.
- Griffith, Erin; Metz, Cade (2023-01-27). "Anthropic Said to Be Closing In on $300 Million in New A.I. Funding". The New York Times. Archived from the original on 9 December 2023. Retrieved 2023-03-14.
- Lanxon, Nate; Bass, Dina; Davalos, Jackie (10 March 2023). "A Cheat Sheet to AI Buzzwords and Their Meanings". Bloomberg News. Archived from the original on 17 November 2023. Retrieved 14 March 2023.
- Metz, Cade (2023-03-14). "OpenAI Plans to Up the Ante in Tech's A.I. Race". The New York Times. ISSN 0362-4331. Archived from the original on 31 March 2023. Retrieved 2023-03-31.
- Thoppilan, Romal; De Freitas, Daniel; Hall, Jamie; Shazeer, Noam; Kulshreshtha, Apoorv (20 January 2022). "LaMDA: Language Models for Dialog Applications". arXiv:2201.08239 [cs.CL].
- Roose, Kevin (2022-10-21). "A Coming-Out Party for Generative A.I., Silicon Valley's New Craze". The New York Times. Archived from the original on 15 February 2023. Retrieved 2023-03-14.
- Metz, Cade (2024-02-15). "OpenAI Unveils A.I. That Instantly Generates Eye-Popping Videos". The New York Times. ISSN 0362-4331. Archived from the original on 15 February 2024. Retrieved 2024-02-16.
- "The race of the AI labs heats up". The Economist. 2023-01-30. Archived from the original on 17 November 2023. Retrieved 2023-03-14.
- Yang, June; Gokturk, Burak (2023-03-14). "Google Cloud brings generative AI to developers, businesses, and governments". Archived from the original on 17 November 2023. Retrieved 15 March 2023.
- Brynjolfsson, Erik; Li, Danielle; Raymond, Lindsey R. (April 2023), Generative AI at Work (Working Paper), Working Paper Series, doi:10.3386/w31161, archived from the original on 28 March 2024, retrieved 2024-01-21
- "Don't fear an AI-induced jobs apocalypse just yet". The Economist. 2023-03-06. Archived from the original on 17 November 2023. Retrieved 2023-03-14.
- Coyle, Jake (2023-09-27). "In Hollywood writers' battle against AI, humans win (for now)". AP News. Associated Press. Archived from the original on 3 April 2024. Retrieved 2024-01-26.
- Harreis, H.; Koullias, T.; Roberts, Roger. "Generative AI: Unlocking the future of fashion". Archived from the original on 17 November 2023. Retrieved 14 March 2023.
- "How Generative AI Can Augment Human Creativity". Harvard Business Review. 2023-06-16. ISSN 0017-8012. Archived from the original on 20 June 2023. Retrieved 2023-06-20.
- Hendrix, Justin (16 May 2023). "Transcript: Senate Judiciary Subcommittee Hearing on Oversight of AI". techpolicy.press. Archived from the original on 17 November 2023. Retrieved 19 May 2023.
- Simon, Felix M.; Altay, Sacha; Mercier, Hugo (2023-10-18). "Misinformation reloaded? Fears about the impact of generative AI on misinformation are overblown". Harvard Kennedy School Misinformation Review. doi:10.37016/mr-2020-127. S2CID 264113883. Archived from the original on 17 November 2023. Retrieved 16 November 2023.
- "New AI systems collide with copyright law". BBC News. 2023-08-01. Retrieved 2024-09-28.
Media used on this page
A synthograph of an astronaut riding a horse, created in a HuggingFace Space with Stable Diffusion 3.5 Large. The prompt is "a photograph of an astronaut riding a horse". This artwork was created with a text-to-image (txt2img) process.
Author/Creator: Benlisquare, License: CC BY-SA 4.0
Demonstration of an algorithmically-generated audio track featuring bossa nova music accompanied by electric guitar, created using Riffusion, an open-source fine-tuned derivative of the Stable Diffusion image-generation diffusion model that has been retrained to generate images of audio spectrograms, which can then be converted into audio files.
An audio spectrogram is a visual representation of an audio clip's frequency content, and images of spectrograms can be converted back into audio via an inverse short-time Fourier transform, using the Griffin-Lim algorithm to approximate the phase during audio reconstruction. While the Stable Diffusion AI model was originally intended to generate visual images from a textual prompt, Riffusion has been retrained from Stable Diffusion v1.5 to instead generate spectrogram images from text prompts describing musical motifs, fine-tuned through the use of Nvidia A10G enterprise datacenter GPUs.
- Procedure/Methodology
The spectrograms were generated using the Riffusion Inference Server running the riffusion-model-v1 diffusion model, paired with the Riffusion App UI frontend. The following values were used:
- Prompt: "bossa nova with electric guitar"
- Seed Image: OG Beat
- Denoising: 0.75
This resulted in the output spectrogram image:

Spectrograms were then converted to WAV audio using a Python script (not reproduced here).
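The sketch below is not the script referred to above. It is a minimal NumPy-only illustration of the Griffin-Lim idea: given only spectrogram magnitudes, it alternates between an inverse and a forward short-time Fourier transform to estimate a plausible phase. It assumes the magnitudes have already been decoded from the image into an array of shape (frames, frequency bins); how Riffusion maps pixel values to magnitudes is not covered. The function names, the Hann window, and the values of `n_fft`, `hop`, and `n_iter` are choices made for this example only.

```python
import numpy as np


def stft(signal, n_fft, hop):
    """Short-time Fourier transform with a Hann window; returns (frames, bins)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop:i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def istft(spec, n_fft, hop):
    """Inverse STFT by windowed overlap-add, divided by the summed squared window."""
    window = np.hanning(n_fft)
    n_frames = spec.shape[0]
    length = n_fft + hop * (n_frames - 1)
    signal = np.zeros(length)
    norm = np.zeros(length)
    frames = np.fft.irfft(spec, n=n_fft, axis=1)
    for i in range(n_frames):
        start = i * hop
        signal[start:start + n_fft] += frames[i] * window
        norm[start:start + n_fft] += window ** 2
    # Samples near the two ends are covered by few frames and are least reliable.
    return signal / np.maximum(norm, 1e-8)


def griffin_lim(magnitude, n_fft, hop, n_iter=50, seed=0):
    """Estimate a signal whose STFT magnitude approximates `magnitude`."""
    rng = np.random.default_rng(seed)
    phase = np.exp(2j * np.pi * rng.random(magnitude.shape))  # random initial phase
    for _ in range(n_iter):
        signal = istft(magnitude * phase, n_fft, hop)
        rebuilt = stft(signal, n_fft, hop)
        phase = np.exp(1j * np.angle(rebuilt))  # keep the phase, discard the magnitude
    return istft(magnitude * phase, n_fft, hop)


# Demo: take the magnitude of a known test tone, discard its phase, rebuild audio.
sample_rate, n_fft, hop = 8000, 256, 64
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)
target = np.abs(stft(tone, n_fft, hop))
audio = griffin_lim(target, n_fft, hop)
error = np.linalg.norm(np.abs(stft(audio, n_fft, hop)) - target) / np.linalg.norm(target)
print(f"relative spectral error: {error:.3f}")
```

The reconstructed `audio` array could then be scaled to 16-bit integers and written to a WAV file, for example with Python's standard-library `wave` module.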
Author/Creator: Marxav, License: CC BY-SA 4.0
Architecture of a generative AI agent that uses a Large Language Model (LLM) and additional optional modules (data, tools, other models).
Author/Creator: Lwneal, License: CC0
Above: Schematic example of a discriminative neural network performing image recognition. Below: Example of a generative neural network performing text-to-image generation
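The difference in direction between the two kinds of model in this schematic can be sketched with a deliberately tiny example. Each "image" is reduced to a single number, and the labels, values, and function names are all invented for illustration; real image recognition and text-to-image systems use neural networks rather than the one-dimensional Gaussians assumed here.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical training data: one brightness value per example, grouped by label.
training_data = {
    "night": [rng.gauss(0.2, 0.05) for _ in range(200)],
    "day": [rng.gauss(0.8, 0.05) for _ in range(200)],
}

# Summarise each label by the mean and standard deviation of its examples.
params = {
    label: (statistics.mean(values), statistics.stdev(values))
    for label, values in training_data.items()
}


def classify(x):
    """Discriminative direction: data in, label out (nearest class mean)."""
    return min(params, key=lambda label: abs(x - params[label][0]))


def generate(label):
    """Generative direction: label (the "prompt") in, new data out."""
    mean, std = params[label]
    return rng.gauss(mean, std)


print(classify(0.75))
new_example = generate("night")
print(round(new_example, 3), classify(new_example))
```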
A video generated from a text prompt using OpenAI's Sora. The prompt is as follows: "Borneo wildlife on the Kinabatangan River"