Wunjo.wladradchenko.ru @wladradchenko

Wunjo AI: Advanced Speech & Deepfake Neural Network Tool

Documentation
Issue · Discussions · Tutorial

About

Wunjo AI brings neural-network tools for speech and video to your desktop. Whether you are working on speech synthesis, creating deepfake animations, redrawing video with Stable Diffusion from a text prompt, or editing video, Wunjo AI has you covered.

Key Features:

Speech Synthesis: Effortlessly convert text into human-like speech.
Voice Cloning: Clone voices from provided audio files or directly record your voice within the app for real-time cloning.
Multilingual Support: Currently supports English, Russian, and Chinese for voice cloning (the source audio can be in any language) and English and Russian for synthesis. You can also propose your language for voice cloning. A single text can contain speech in several languages.
Real-time Speech Recognition: Dictate text and get instant transcriptions. An efficient tool for hands-free content creation.
Multidialogue Creation: Craft multi-dialogues using unlimited characters with distinct voice profiles.
Audio Processing: Speech enhancement and separation of vocals and music.
Video-to-Video by Text Prompt:
- Reshape videos by text prompt using different Stable Diffusion models. Let generative neural networks craft a new visual narrative.
- Change individual objects in a video by text prompt with one click; each object can be changed throughout the video with its own text query.
- Preserve specific objects unchanged by using the «pass» keyword.
- Change the video style from an edited img2img frame.
Deepfake Animation:
- Animate faces using just one photo combined with audio.
- Achieve precise lip syncing with your audio using our deepfake lips feature.
- Effortlessly swap faces in videos, GIFs, and photos using just a single photograph with our "Face Swap" feature.
- Experimental feature: change the emotions of a person in a video using a text description.
AI Retouch Tool: Elevate your videos by removing unwanted objects or refining the quality of your deepfakes. Automatic removal of animated text.
Automatic Segmentation Mask: Select any object at any time period and get a storyboard of the selected object with a transparent or colored background.

Applications: From voiceovers in commercials to character voicing in games, from audiobook narration to fun deepfake projects, Wunjo AI covers a wide range of uses, and everything is free and runs locally on your device.

Why Choose Wunjo AI?

All-in-One: A comprehensive tool catering to both your voice and visual AI needs.
User-friendly: Designed for all, from beginners to professionals.
Privacy First: Functions locally on your desktop, ensuring your data remains private.
Open-source & Free: Benefit from community-driven enhancements and enjoy the app without any cost.

Step into the future of AI-powered creativity with Wunjo AI.

Setup

Requirements: Python 3.10 and ffmpeg.
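The requirements above can be checked before installing. The following is a minimal sketch, not part of Wunjo AI itself: the function name `check_requirements` is hypothetical, and it assumes that "Python version 3.10" means exactly the 3.10 release series and that ffmpeg must be discoverable on `PATH`.

```python
import shutil
import sys

# Required interpreter series, taken from the requirements line above.
REQUIRED_PYTHON = (3, 10)


def check_requirements(version_info=sys.version_info, which=shutil.which):
    """Return a list of problems found; an empty list means both checks passed.

    `version_info` and `which` are parameters only so the check can be
    exercised without changing the real environment (hypothetical design).
    """
    problems = []
    major_minor = tuple(version_info[:2])
    if major_minor != REQUIRED_PYTHON:
        problems.append(
            "Python %d.%d found, but %d.%d is required"
            % (major_minor + REQUIRED_PYTHON)
        )
    # Assumption: ffmpeg is expected to be available as an executable on PATH.
    if which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH")
    return problems
```

Calling `check_requirements()` with no arguments inspects the running interpreter and the current `PATH`; printing the returned list shows what, if anything, is missing.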

For detailed instructions on setting up Wunjo AI from GitHub, refer to the Launch Project from GitHub section in our wiki.

The official installer and portable versions are available on the website.

Installation packages

Ubuntu / Debian v1.6 (GPU version)

Detailed instructions for installing Wunjo AI on Ubuntu / Debian from the installer

MacOS v1.6 (CPU version)

Because the author of the project does not have an Apple license, there is currently no official installer.

Windows v1.6 (CPU version)

Detailed instructions for installing Wunjo AI on Windows from the installer

See the Wunjo AI documentation for how to use a GPU on Windows.

Example

Speech synthesis and voice cloning

Face animation from image src

Original	Fix face + Enhancer

Mouth animation from video src

Original	Mouth animation	Mouth animation + Enhancer

Face swap by one photo

Original photo	Original video	Face swap + Background enhancer

Remove object by Retouch AI

Original video	Remove object

Retouch AI to improve quality of deepfake

Defective lines on the chin after lip animation	Retouched lines on the chin + Face swap

Get segmentation mask by one click

Original	Mask

Video-to-Video by Text Prompt (Only for GPU)

The higher the video resolution, the better the quality of the drawn frames.

Video resolution 512x512 default model for deepfake

Original	Blonde hair + Brown jacket

Video resolution 512x512 custom model for anime

Additionally, you can use a custom Stable Diffusion model to redraw a video, or objects in a video, over different parts of the timeline.

Original	Object pass + Background change

Object change + Enhancement move	Object change + Enhancement anime

Maximum video resolution by GPU VRAM

32 GB	23 GB	18 GB	14 GB	10 GB	8 GB	7 GB
1280x1280	1080x1080	1024x1024	768x768	640x640	576x576	512x512
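The table above can be expressed as a simple lookup. This is an illustrative sketch only: the function name `max_resolution` is hypothetical, and it assumes each VRAM figure in the table is a minimum threshold, so a GPU that falls between two columns gets the resolution of the next lower column.

```python
# (minimum VRAM in GB, side of the largest square frame in pixels),
# copied from the table above and ordered from largest to smallest.
VRAM_LIMITS = [
    (32, 1280),
    (23, 1080),
    (18, 1024),
    (14, 768),
    (10, 640),
    (8, 576),
    (7, 512),
]


def max_resolution(vram_gb):
    """Return the largest listed frame side for the given VRAM, or None.

    None means the GPU has less VRAM than the smallest entry in the table.
    Treating the table values as thresholds is an assumption.
    """
    for min_vram, side in VRAM_LIMITS:
        if vram_gb >= min_vram:
            return side
    return None
```

For example, under this reading a 12 GB GPU maps to 640x640, because it clears the 10 GB column but not the 14 GB one.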

Emotion deepfake [Experimental]

This is an experimental feature that is under development, but you can take a look at some of the work right now in Wunjo AI.

Original	Happy	Angry

Fear	Sad	Disgust

Language

The application comes with built-in support for the following languages: English, Russian, Chinese, Portuguese, and Korean.

If you wish to add a new language:

Navigate to .wunjo/settings/settings.json. Add your desired language in the format: "default_language": {"name": "code"}. To find the appropriate code for your language, please refer to the Google Cloud Translate Language Codes.
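The edit described above can also be scripted. The sketch below is an assumption-heavy illustration rather than the application's own tooling: the helper `add_language` is hypothetical, the caller supplies the full path to `settings.json` (the location of the `.wunjo` directory is not stated here), and it assumes `default_language` holds a mapping of language names to codes, so a new entry is added alongside any existing ones.

```python
import json
from pathlib import Path


def add_language(settings_path, name, code):
    """Add a language entry under "default_language" and save the file.

    Assumption: "default_language" is a JSON object mapping a language
    name to its Google Cloud Translate code, e.g. {"French": "fr"}.
    """
    path = Path(settings_path)
    if path.exists():
        settings = json.loads(path.read_text(encoding="utf-8"))
    else:
        # Start from an empty settings object if the file is missing.
        settings = {}
        path.parent.mkdir(parents=True, exist_ok=True)

    languages = settings.setdefault("default_language", {})
    languages[name] = code

    path.write_text(
        json.dumps(settings, ensure_ascii=False, indent=4),
        encoding="utf-8",
    )
    return settings
```

A call such as `add_language(".wunjo/settings/settings.json", "French", "fr")` would then write the new entry; back up the file first, since the real schema may contain keys this sketch does not know about.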

Update

Update 1.6.0

[x] Improved and automated object removal from images and video
[x] Improved editing of video elements
[x] Added automatic segmentation mask with saving
[x] Added Video2Video with ControlNet by text prompt tool
[x] Added InpaintVideoMask2Video with ControlNet by text prompt tool
[x] Optimized memory usage for face swapping in long videos
[x] Optimized memory usage for retouching and object removal in long videos

Update 1.6.1

[x] Fixed a bug in the enhancer. Improved the enhancer for video and faces. Added an enhancer for drawn video
[x] Improved vocoder for voice cloning
[x] Added cloning of speech speed
[x] Added a model to get background sound for deepfakes and clear voice without background noise
[x] Added feature to get background noise from audio or video
[x] Improved encoder for voice cloning
[x] ~~Imitate voice emotions~~ and improved voice cloning quality.
[x] Reduced the amount of RAM used for mouth animation and improved video quality
[x] Added speech enhancement
[x] ~~Music generation~~
[x] Added a module for changing video style from images
[x] Multilanguage speech from a single text
[x] Added automatic text removal from video or images with automatic text mask creation

Update 1.6.2

[x] Custom browser for WebGUI
[x] Added selection of the browser to run
[x] Added a check for offline versus online mode
[x] Added a message for the user about missing models and how to download them manually

Update 2.0.0

Will there be a v2 version? Yes!

The upcoming v2 is not merely an update to the familiar Wunjo AI v1. It is a standalone version that will be developed in parallel with v1 and will open up new possibilities.

While v1 was designed as a user-friendly application, simplifying processes like creating deepfakes and speech synthesis into single-click actions with minimal entry barriers, v2 is envisioned as a professional-grade editor, fostering limitless creativity.

In Wunjo v2, users will be able to craft their own Node bundles and save and load them with ease.

Some screenshots showing the current development progress:

Node logical	Menu

For more insights into the progress of Wunjo V2's development, join our community at Blog on Telegram. Stay tuned for sneak peeks!

Video

Review	How to install on Windows?

Support the Project

You can support the author of the project in developing his creative ideas, or just treat him to a cup of coffee in USD or a slice of pizza in RUB. There are other ways to support the development of the project; more details are on the page.

Buy a cup of coffee in USD	Buy a slice of pizza in RUB

Supporters and Donors

I extend heartfelt gratitude to the following individuals who have generously supported this project through donations:

Konstantin Kravtsov.
Several contributors who have chosen to remain anonymous or opted not to be listed publicly. Your support is immensely appreciated.

I sincerely appreciate the generosity of all project supporters. Your contributions enable me to continue improving and maintaining this project.

Contact

Owner: Wladislav Radchenko

Email: i@wladradchenko.ru

Project: https://github.com/wladradchenko/wunjo.wladradchenko.ru

Web site: wladradchenko.ru/wunjo

Premise

Wunjo comes from the ancient runic alphabet and represents joy and contentment, which ties in with the idea of using the application to create engaging and expressive speech. Wunjo (ᚹ), also transliterated Vunyo, is the eighth rune of the Elder and Anglo-Saxon Futhark. Before the letter W was introduced into the Latin alphabet, English used the letter Ƿynn (Ƿƿ), which derives from this rune.

Credits

Wunjo AI is built upon the remarkable work of various open-source projects. Each integrated component reflects a commitment to improving and adapting existing technologies within the collaborative landscape of open-source development. The list below highlights the projects that have been adapted and enhanced for inclusion in Wunjo AI:

Speech Synthesis & Voice Cloning: Adapted versions of Tacotron 2, Waveglow, and improved Real-Time Voice Cloning with VoiceFixer
User Interface & Packaging: Implementations of Flask UI and BeeWare
Audio Processing: Adapted Open-Unmix for audio separation
Facial Animation & Enhancement: Adapted versions of Wav2lip, Face Utils
Image & Video Enhancement: Adapted Real-ESRGAN for superior quality enhancements
Video Processing & Segmentation: Adaptations of Segment Anything, Rerender a Video, GMFlow, ControlNet and upgraded Ebsynth
AI Art Generation: Adaptation of Stable Diffusion for AI-driven video art

I extend my deepest gratitude to the original contributors of these technologies. Their groundbreaking work has been instrumental in advancing the capabilities of Wunjo AI. For persistent storage and versioning of the models I have personally trained, I use Hugging Face Model Storage. If you are interested in contributing to Wunjo AI, especially in voice cloning for new languages, please feel free to propose your models or reach out for collaboration via GitHub Discussions or Hugging Face.
