LinguaNet: A Neural Network Based Language Identification System

Project Overview

LinguaNet is a language identification model built on a deep neural network (DNN) with Python and TensorFlow. It uses character n-grams as features to classify the language of a given text input. The system takes text entered by the user and returns the identified language.

The project also visualizes the learned features with t-SNE and PCA to give insight into how the network differentiates between languages. The model reached a final accuracy of 89.2%.

Dependencies

Python (>=3.7)
TensorFlow (>=2.4.0)
Scikit-learn (>=0.24.0)
Numpy (>=1.19.5)
Matplotlib (>=3.3.2)

Installation

Clone the repository:

git clone https://github.com/mohamzamir/LinguaNet-A-Neural-Network-Based-Language-Identification-System

Example

1. Load the page to open the main interface.

2. Click on the languages to see examples of the languages the system can recognize.

3. In this example, a Spanish sentence is entered in the Language box.

4. When the 'Submit' button is clicked, the identified language is shown.

Model

The model is a Deep Neural Network (DNN) implemented using Python and TensorFlow. It utilizes character n-grams to create distinct feature sets for different languages, and these features are used to classify the language of a given text input.
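As a rough illustration of the character n-gram features described above, the following is a minimal sketch in plain Python. It is not the repository's actual preprocessing code: the function names (`char_ngrams`, `build_vocabulary`, `vectorize`), the n-gram range of 1 to 3, and the toy corpus are all assumptions made for this example. The resulting fixed-length vectors are the kind of input a dense network could be trained on.

```python
from collections import Counter


def char_ngrams(text, n_min=1, n_max=3):
    """Count every character n-gram of length n_min..n_max in `text`."""
    text = text.lower()
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def build_vocabulary(texts, n_min=1, n_max=3):
    """Map each n-gram seen in the corpus to a fixed column index."""
    grams = sorted({g for t in texts for g in char_ngrams(t, n_min, n_max)})
    return {gram: idx for idx, gram in enumerate(grams)}


def vectorize(text, vocab, n_min=1, n_max=3):
    """Turn one text into a relative-frequency vector over the vocabulary."""
    counts = char_ngrams(text, n_min, n_max)
    total = sum(counts.values()) or 1  # avoid division by zero on empty text
    vec = [0.0] * len(vocab)
    for gram, count in counts.items():
        if gram in vocab:  # n-grams unseen during vocabulary building are skipped
            vec[vocab[gram]] = count / total
    return vec


# Toy corpus, invented for illustration only.
corpus = ["hello world", "hola mundo"]
vocab = build_vocabulary(corpus)
features = vectorize("hola", vocab)
```

Each position in `features` corresponds to one n-gram in `vocab`, so texts of any length map to vectors of the same size.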

To gain insights into the feature representation, we visualized the learned features of the model using techniques such as t-SNE and PCA.
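The PCA step can be sketched as follows, using only NumPy. This is an assumption-laden example rather than the project's own visualization code: `pca_project` is a hypothetical helper, and the random matrix stands in for real learned features, which would normally be taken from a hidden layer of the trained network.

```python
import numpy as np


def pca_project(features, n_components=2):
    """Project the rows of `features` onto their top principal components."""
    X = np.asarray(features, dtype=float)
    X_centered = X - X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ vt[:n_components].T


# Stand-in for learned features: 6 samples x 4 dimensions, invented for illustration.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(6, 4))
coords = pca_project(hidden)
```

The two columns of `coords` could then be passed to a Matplotlib scatter plot, colored by language label, to see whether languages form separate clusters.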

Performance

The LinguaNet model achieved a final accuracy of 89.2% on language identification from text input.
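For reference, here is a minimal sketch of how such an accuracy figure is typically computed, assuming accuracy means the fraction of inputs whose predicted language matches the true label. The README does not state the evaluation set or split, and the labels below are invented for illustration.

```python
def accuracy(predicted, actual):
    """Return the fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical labels: three of the four predictions are correct.
predicted = ["es", "en", "fr", "en"]
actual = ["es", "en", "fr", "de"]
score = accuracy(predicted, actual)
```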

Future Work

We plan to refine the model further, aiming to increase its accuracy and expand its scope to include more languages. Contributions and feedback are welcome.

License

Copyright 2023 Amir Hamza

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at:

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
