## Technologies and Tools
- Programming Languages: Python
- Frameworks and Libraries:
  - PySide6 (Qt framework for GUI development)
  - PyAudio (audio processing)
  - OpenAI API (language model integration)
  - Google Cloud Text-to-Speech API
  - Porcupine (wake word detection)
  - Whisper (speech recognition)
  - spaCy (natural language processing)
  - chromadb (vector database)
- Tools:
  - PyInstaller (packaging Python applications)
  - dmgbuild (creating macOS disk images)
## Functionality
The main project in this repository is the Jarvis Voice Assistant app, a voice-controlled AI assistant that listens for a wake word, transcribes the spoken command, and answers both aloud and in a chat window. Key functionalities include:
- Wake word detection using Porcupine
- Speech recognition using Whisper
- Natural language processing using OpenAI's language models (GPT-3.5 and GPT-4)
- Text-to-speech using Google Cloud Text-to-Speech API or free alternatives
- Storing conversation history and summaries using chromadb
- Emailing responses and reminders to the user
- Performing internet searches and synthesizing information
- Real-time chat interface with typing animations
- Customizable settings menu for API keys and configurations
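
To illustrate the idea behind storing and retrieving conversation history by similarity, here is a minimal sketch. It is not the project's code and does not use chromadb: `HistoryStore`, `embed`, and `cosine` are hypothetical names, and the bag-of-words "embedding" is a toy stand-in for the learned embeddings a vector database would normally use.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words 'embedding' (an assumption, not chromadb's embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class HistoryStore:
    """Hypothetical in-memory stand-in for a vector database collection."""

    def __init__(self):
        self._items = []  # list of (text, vector) pairs

    def add(self, text):
        self._items.append((text, embed(text)))

    def query(self, text, n_results=1):
        """Return the n stored entries most similar to the query text."""
        query_vec = embed(text)
        ranked = sorted(
            self._items,
            key=lambda item: cosine(query_vec, item[1]),
            reverse=True,
        )
        return [stored for stored, _ in ranked[:n_results]]


store = HistoryStore()
store.add("User asked for the weather in Paris")
store.add("User set a reminder to call the dentist")
print(store.query("what was the reminder about", n_results=1))
```

With these two entries, the reminder entry ranks first because it shares more words with the query. A real vector database replaces the word counts with dense embeddings, so that paraphrases also match.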
## Relevant Skills
- Advanced GUI development using PySide6 (Qt) with custom widgets and animations
- Audio processing and streaming using PyAudio
- Integration of various AI technologies (speech recognition, language models, text-to-speech)
- Utilization of cloud APIs (OpenAI, Google Cloud)
- Implementation of wake word detection using Porcupine
- Natural language processing techniques using spaCy
- Storing and retrieving conversation history using chromadb vector database
- Packaging Python applications into executable formats using PyInstaller
- Creating macOS disk images using dmgbuild
- Multithreading and multiprocessing for handling background tasks and audio playback
- Error handling and graceful recovery mechanisms
- Modular and organized codebase with separate files for different functionalities
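
One common way to combine error handling with a choice between a cloud text-to-speech service and a free alternative is a fallback chain. The sketch below is an assumption about how that could look, not the project's implementation: `synthesize_with_fallback`, `cloud_tts`, and `local_tts` are hypothetical names, and the two backends are stubs.

```python
import logging

logger = logging.getLogger("assistant")


def synthesize_with_fallback(text, backends):
    """Try each (name, callable) backend in order.

    Returns (name, result) from the first backend that succeeds and
    raises RuntimeError if every backend fails.
    """
    errors = []
    for name, backend in backends:
        try:
            return name, backend(text)
        except Exception as exc:  # broad on purpose: any failure triggers the fallback
            logger.warning("TTS backend %s failed: %s", name, exc)
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all TTS backends failed: " + "; ".join(errors))


# Hypothetical stand-ins for a cloud backend and a free local one.
def cloud_tts(text):
    raise ConnectionError("network unreachable")


def local_tts(text):
    return "local.wav"


name, path = synthesize_with_fallback(
    "Hello", [("cloud", cloud_tts), ("local", local_tts)]
)
print(name, path)
```

Because the simulated cloud backend fails here, the call falls through to the local stub. The order of the list sets the preference, so a settings menu could reorder the backends without touching the function.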
## Example Code
- Wake word detection and audio processing:
```python
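import pvporcupine

# Helper functions (get_pico_key, prep_mic, listen_to_user, ...) are defined elsewhere in the project.

def jarvis_process(jarvis_stop_event, jarvis_skip_event, queue, text_queue):
    # Create the Porcupine wake word engine.
    handle = pvporcupine.create(access_key=get_pico_key(), keywords=['Jarvis'], keyword_paths=[get_pico_wake_path()])
    prep_mic()
    start_audio_stream(handle.sample_rate, handle.frame_length)
    try:
        while not jarvis_stop_event.is_set():
            # Read one frame of PCM samples and check it for the wake word.
            pcm = get_next_audio_frame(handle)
            keyword_index = handle.process(pcm)
            if keyword_index >= 0:
                # Wake word heard: record, transcribe, answer, and speak the reply.
                query_audio = listen_to_user()
                query = convert_to_text(query_audio)
                response = processor(query, skip=jarvis_skip_event, text_queue=text_queue)
                audio_path = text_to_speech(response, model=get_model()['name'])
                play_audio_file(audio_path, added_stop_event=jarvis_skip_event)
    finally:
        handle.delete()  # release the native Porcupine resources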
```
- Real-time chat interface with typing animations:
```javascript
function process_typing_queue() {
    if (typing_queue.length > 0) {
        var message = typing_queue[0];
        var new_div = document.createElement('div');
        new_div.innerHTML = message.formatted_text;
        var body = document.getElementsByTagName('body')[0];
        // Attach the new message to the page so it becomes visible.
        body.appendChild(new_div);
        if (message.appear_as_typed) {
            var span = new_div.getElementsByClassName('chat-bubble')[0].getElementsByTagName('span')[0];
            span.innerHTML = "";
            var typedText = "";
            typing_speed = message.typing_delay;
            // Extract the plain text inside the first <span> of the message.
            var match = message.formatted_text.match(/<span[^>]*>([^<]+)<\/span>/);
            var newText = match ? match[1] : "";
            var index = 0;
            // Reveal one character per tick, then move on to the next message.
            function type() {
                if (index < newText.length) {
                    span.innerHTML += newText.charAt(index);
                    index++;
                    window.scrollTo(0, document.body.scrollHeight);
                    setTimeout(type, typing_speed);
                } else {
                    typing_queue.shift();
                    process_typing_queue();
                }
            }
            type();
        } else {
            typing_queue.shift();
            process_typing_queue();
        }
    }
}
```
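The `jarvis_process` signature above takes a stop event, a skip event, and queues, which suggests a background worker controlled from the GUI. The following is a minimal sketch of that pattern under stated assumptions: `worker` is a hypothetical stand-in for the assistant loop, and it uses `threading` for brevity (the `multiprocessing` module offers analogous `Event` and `Queue` classes).

```python
import queue
import threading
import time


def worker(stop_event, skip_event, text_queue):
    """Stand-in for the assistant loop: emits messages until asked to stop."""
    count = 0
    while not stop_event.is_set():
        if skip_event.is_set():
            skip_event.clear()  # acknowledge the skip request
            text_queue.put("skipped")
        else:
            count += 1
            text_queue.put(f"tick {count}")
        time.sleep(0.01)


stop_event = threading.Event()
skip_event = threading.Event()
text_queue = queue.Queue()

thread = threading.Thread(
    target=worker, args=(stop_event, skip_event, text_queue), daemon=True
)
thread.start()

skip_event.set()  # e.g. the user pressed "skip" in the GUI

# Drain messages until the worker confirms the skip.
messages = []
while "skipped" not in messages:
    messages.append(text_queue.get(timeout=1))

stop_event.set()  # request a clean shutdown
thread.join(timeout=1)
```

Using events rather than killing the worker lets the loop finish its current step and release resources (such as the audio stream) before exiting.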
## Notable Achievements
- Development of a working voice-controlled AI assistant that covers wake word detection, speech recognition, language-model responses, and text-to-speech
- Integration of multiple AI technologies and APIs to create a seamless user experience
- Implementation of a real-time chat interface with typing animations for enhanced user engagement
- Efficient handling of conversation history and summaries using a vector database
- Customizable settings menu for easy configuration of API keys and preferences
- Packaging the application into an executable format for easy distribution and installation

The Jarvis Voice Assistant project demonstrates skills in Python development, AI integration, audio processing, GUI development, and application packaging. It combines several speech and language services behind a chat-style interface with user-configurable settings, which shows the developer's ability to build complete, usable applications.