Image Captioning is the process of automatically generating a caption for an unseen image. It combines Natural Language Processing and Computer Vision to generate the captions. Below are some examples of images on which automatic image captioning has been used to generate captions.
Encoder:
For the Encoder, I used a Convolutional Neural Network (CNN). The image is passed through the CNN to extract the relevant features, and the last hidden state of the CNN is connected to the Decoder. The encoder uses a pre-trained ResNet-50 architecture (with the final fully-connected layer removed) to extract features from a batch of pre-processed images. The output is flattened to a vector and then passed through a Linear layer to transform the feature vector to the same size as the word embedding.
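As a rough illustration, an encoder of this kind can be sketched in PyTorch as below. The class and argument names (`EncoderCNN`, `embed_size`) are illustrative, not taken from the repository code.

```python
import torch.nn as nn
import torchvision.models as models


class EncoderCNN(nn.Module):
    """Pre-trained ResNet-50 feature extractor plus a trainable embedding layer."""

    def __init__(self, embed_size):
        super(EncoderCNN, self).__init__()
        resnet = models.resnet50(pretrained=True)
        # Freeze the pre-trained backbone; only the new Linear layer is trained.
        for param in resnet.parameters():
            param.requires_grad_(False)
        # Drop the final fully-connected classification layer.
        modules = list(resnet.children())[:-1]
        self.resnet = nn.Sequential(*modules)
        # Map the 2048-dim ResNet feature vector to the word-embedding size.
        self.embed = nn.Linear(resnet.fc.in_features, embed_size)

    def forward(self, images):
        features = self.resnet(images)                   # (batch, 2048, 1, 1)
        features = features.view(features.size(0), -1)   # flatten to (batch, 2048)
        features = self.embed(features)                  # (batch, embed_size)
        return features
```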
Decoder:
For the Decoder, I used LSTM (Long Short-Term Memory) units, which take the features from the encoder and produce a sentence.
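A minimal sketch of such a decoder is shown below, assuming the image feature vector is fed in as the first "word" of the sequence. Names like `DecoderRNN`, `hidden_size`, and the greedy `sample` helper are illustrative, not the repository's exact implementation.

```python
import torch
import torch.nn as nn


class DecoderRNN(nn.Module):
    """LSTM decoder that turns image features plus caption tokens into word scores."""

    def __init__(self, embed_size, hidden_size, vocab_size, num_layers=1):
        super(DecoderRNN, self).__init__()
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.lstm = nn.LSTM(embed_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)

    def forward(self, features, captions):
        # Embed the caption tokens, dropping the final <end> token.
        embeddings = self.embed(captions[:, :-1])              # (batch, seq-1, embed)
        # Prepend the image feature vector as the first step of the sequence.
        inputs = torch.cat((features.unsqueeze(1), embeddings), dim=1)
        hiddens, _ = self.lstm(inputs)                          # (batch, seq, hidden)
        return self.fc(hiddens)                                 # (batch, seq, vocab)

    def sample(self, features, max_len=20):
        """Greedy decoding for a single image: feed the best word back in each step."""
        sampled_ids = []
        inputs = features.unsqueeze(1)                          # (1, 1, embed)
        states = None
        for _ in range(max_len):
            hiddens, states = self.lstm(inputs, states)
            scores = self.fc(hiddens.squeeze(1))                # (1, vocab)
            predicted = scores.argmax(dim=1)                    # (1,)
            sampled_ids.append(predicted.item())
            inputs = self.embed(predicted).unsqueeze(1)         # (1, 1, embed)
        return sampled_ids
```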
The Microsoft Common Objects in Context (MS COCO) dataset is a large-scale dataset for scene understanding. The dataset is commonly used to train and benchmark object detection, segmentation, and captioning algorithms.
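For reference, the caption annotations can be browsed with the pycocotools COCO API as in the sketch below; the annotation file path is an assumption and should point to wherever the MS COCO data is stored locally.

```python
from pycocotools.coco import COCO

# Assumed path to the MS COCO captions annotation file.
coco = COCO('annotations/captions_train2014.json')

# Pick one annotation and inspect its caption and the image it belongs to.
ann_id = list(coco.anns.keys())[0]
annotation = coco.anns[ann_id]
print(annotation['caption'])                        # the caption text
img_info = coco.loadImgs(annotation['image_id'])[0]
print(img_info['file_name'])                        # the corresponding image file
```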
- Clone the repository, and navigate to the downloaded folder. This may take a minute or two to clone due to the included image data.

  git clone https://github.com/rohitvk1/Automatic-Image-Captioning-with-Pytorch.git

- Create (and activate) a new Anaconda environment (Python 3.6). Download via Anaconda.

  - Linux or Mac:

    conda create -n cv-nd python=3.6
    source activate cv-nd

  - Windows:

    conda create --name cv-nd python=3.6
    activate cv-nd

- Install PyTorch and torchvision; this should install the latest version of PyTorch.

  conda install pytorch torchvision cudatoolkit=9.0 -c pytorch

- Install a few required pip packages, which are specified in the requirements text file (including OpenCV).

  pip install -r requirements.txt
