Conversational text Analysis using various NLP techniques

Last update: Dec 25, 2022

Overview

PyConverse

Let me try first

Installation

pip install pyconverse

Usage

Please try this notebook that demos the core functionalities: basic usage notebook

Introduction

Conversation analytics plays an increasingly important role in shaping great customer experiences across various industries like finance/contact centres etc... primarily to gain a deeper understanding of the customers and to better serve their needs. This library, PyConverse is an attempt to provide tools & methods which can be used to gain an understanding of the conversations from multiple perspectives using various NLP techniques.

Why PyConverse?

I have been doing what can be called conversational text NLP with primarily contact centre data from various domains like Financial services, Banking, Insurance etc for the past year or so, and I have not come across any interesting open-source tools that can help in understanding conversational texts as such I decided to create this library that can provide various tools and methods to analyse calls and help answer important questions/compute important metrics that usually people want to find from conversations, in contact centre data analysis settings.

Where can I use PyConverse?

The primary use case is geared towards contact centre call analytics, but most of the tools that Converse provides can be used elsewhere as well.

There’s a lot of insights hidden in every single call that happens, Converse enables you to extract those insights and compute various kinds of KPIs from the point of Operational Efficiency, Agent Effectiveness & monitoring Customer Experience etc.

If you are looking to answer questions like these:-

What was the overall sentiment of the conversation that was exhibited by the speakers?
Was there periods of dead air(silence periods) between the agents and customer? if so how much?
Was the agent empathetic towards the customer?
What was the average agent response time/average hold time?
What was being said on calls?

and more... pyconverse might be of small help.

What can PyConverse do?

At the moment pyconverse can do a few things that broadly fall into these categories:-

Emotion identification
Empathetic statement identification
Call Segmentation
Topic identification from call segments
Compute various types of Speaker attributes:
1. linguistic attributes like: word counts/number of words per utterance/negations etc.
2. Identify periods of silence & interruptions.
3. Question identification
4. Backchannel identification
Assess the overall nature of the speaker via linguistic attributes and tell if the Speaker is:
1. Talkative, verbally fluent
2. Informal/Personal/social
3. Goal-oriented or Forward/future-looking/focused on past
4. Identify inhibitions

What Next?

Improve documentation.
Add more use case notebooks/examples.
Improve some of the functionalities and make it more streamlined.

Built with:

Transformers	Spacy	Pytorch

Credits:

Note: The backchannel Utterance classification method is inspired by facebook's Unsupervised Topic Segmentation of Meetings with BERT Embeddings paper (arXiv:2106.12978 [cs.LG])

It is a system used to detect bone fractures. using techniques deep learning and image processing

MohammedHussiengadalla-Intelligent-Classification-System-for-Bone-Fractures It is a system used to detect bone fractures. using techniques deep learni

7 Nov 11, 2022

Python implementation of 3D facial mesh exaggeration using the techniques described in the paper: Computational Caricaturization of Surfaces.

8 Nov 1, 2022

This repository contains various models targetting multimodal representation learning, multimodal fusion for downstream tasks such as multimodal sentiment analysis.

Multimodal Deep Learning 🎆 🎆 🎆 Announcing the multimodal deep learning repository that contains implementation of various deep learning-based model

Deep Cognition and Language Research (DeCLaRe) Lab

398 Dec 30, 2022

Collection of NLP model explanations and accompanying analysis tools

Thermostat is a large collection of NLP model explanations and accompanying analysis tools. Combines explainability methods from the captum library wi

126 Nov 22, 2022

A Python multilingual toolkit for Sentiment Analysis and Social NLP tasks

pysentimiento: A Python toolkit for Sentiment Analysis and Social NLP tasks A Transformer-based library for SocialNLP classification tasks. Currently

298 Jan 7, 2023

Library of various Few-Shot Learning frameworks for text classification

FewShotText This repository contains code for the paper A Neural Few-Shot Text Classification Reality Check Environment setup # Create environment pyt

47 Jan 3, 2023

🐦 Opytimizer is a Python library consisting of meta-heuristic optimization techniques.

Opytimizer: A Nature-Inspired Python Optimizer Welcome to Opytimizer. Did you ever reach a bottleneck in your computational experiments? Are you tired

546 Dec 31, 2022

Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.

This is the Vowpal Wabbit fast online learning code. Why Vowpal Wabbit? Vowpal Wabbit is a machine learning system which pushes the frontier of machin

8.1k Jan 6, 2023

tsai is an open-source deep learning package built on top of Pytorch & fastai focused on state-of-the-art techniques for time series classification, regression and forecasting.

Time series Timeseries Deep Learning Pytorch fastai - State-of-the-art Deep Learning with Time Series and Sequences in Pytorch / fastai

2.8k Jan 8, 2023

Comments

SemanticTextSegmentation NaN With All Stop Words

When running semantic text segmentation, I found that if the input utterance line is all stop words, (i.e. "Bye. Uh huh. Yeah."), SemanticTextSegmentation._get_similarity fails with ValueError: Input contains NaN.

I found that adding a check for nan in both embeddings could solve this problem.

def _get_similarity(self, text1, text2):
    sentence_1 = [i.text.strip()
                  for i in nlp(text1).sents if len(i.text.split(' ')) > 1]
    sentence_2 = [i.text.strip()
                  for i in nlp(text2).sents if len(i.text.split(' ')) > 2]
    embeding_1 = model.encode(sentence_1)
    embeding_2 = model.encode(sentence_2)
    embeding_1 = np.mean(embeding_1, axis=0).reshape(1, -1)
    embeding_2 = np.mean(embeding_2, axis=0).reshape(1, -1)

    if np.any(np.isnan(embeding_1)) or np.any(np.isnan(embeding_2)):
            return 1

    sim = cosine_similarity(embeding_1, embeding_2)
    return sim

I would like to have someone else look at it because I don't want to make any assumptions that the stop words should be part of the same segments.

opened by Haowjy 1

Updated lru_cache decorator.

After installing and running the library pyconverse on python-3.7 or below and using the import statement it gives error in import itself. I went through the utils file and saw that the "@lru_cache" decorator was written as per the new python(i.e. 3.8+) style hence when calling in older versions(py 3.7 and below it raises a NoneType Error) as the LRU_CACHE decorator is written as -" @lru_cache() " with paranthesis for older versions . Hence made the changes. The changes made do not cause any error on the newer versions.

opened by AkashKhamkar 0
Error in importing Callyzer, SpeakerStats

When I want to load the model it's showing this error.Whether it is currently in devloped mode

KeyError: "[E002] Can't find factory for 'tok2vec'. This usually happens when spaCy callsnlp.create_pipewith a component name that's not built in - for example, when constructing the pipeline from a model's meta.json. If you're using a custom component, you can write to Language.factories['tok2vec'] or remove it from the ### model meta and add it vianlp.add_pipeinstead.

opened by kalpa277 0

Releases(v0.2.0)

v0.2.0(Nov 21, 2021)
First Release of PyConverse library.

Conversational Transcript Analysis using various NLP techniques.

Emotion identification

Empathetic statement identification

Call Segmentation

Topic identification from call segments

Compute various types of Speaker attributes:

linguistic attributes like : word counts/number of words per utterance/negations etc

Identify periods of silence & interruptions.

Question identification

Backchannel identification

Assess the overall nature of the speaker via linguistic attributes and tell if the Speaker is:

Talkative, verbally fluent

Informal/Personal/social

Goal-oriented or Forward/future-looking/focused on past

Identify inhibitions

Source code(tar.gz)
Source code(zip)

Owner

Rita Anjana

ML engineer

GitHub Repository

This is a deep learning-based method to segment deep brain structures and a brain mask from T1 weighted MRI.

DBSegment This tool generates 30 deep brain structures segmentation, as well as a brain mask from T1-Weighted MRI. The whole procedure should take ~1

2 Oct 25, 2022

Python parser for DTED data.

DTED Parser This is a package written in pure python (with help from numpy) to parse and investigate Digital Terrain Elevation Data (DTED) files. This

12 Dec 18, 2022

Active window border replacement for window managers.

xborder Active window border replacement for window managers. Usage git clone https://github.com/deter0/xborder cd xborder chmod +x xborders ./xborder

250 Dec 30, 2022

Scalable machine learning based time series forecasting

mlforecast Scalable machine learning based time series forecasting. Install PyPI pip install mlforecast Optional dependencies If you want more functio

145 Dec 24, 2022

Code for our paper "SimCLS: A Simple Framework for Contrastive Learning of Abstractive Summarization", ACL 2021

SimCLS Code for our paper: "SimCLS: A Simple Framework for Contrastive Learning of Abstractive Summarization", ACL 2021 1. How to Install Requirements

150 Dec 12, 2022

Data pipelines for both TensorFlow and PyTorch!

rapidnlp-datasets Data pipelines for both TensorFlow and PyTorch ! If you want to load public datasets, try: tensorflow/datasets huggingface/datasets

1 Dec 08, 2021

《Train in Germany, Test in The USA: Making 3D Object Detectors Generalize》(CVPR 2020)

Train in Germany, Test in The USA: Making 3D Object Detectors Generalize This paper has been accpeted by Conference on Computer Vision and Pattern Rec

101 Jan 02, 2023

:boar: :bear: Deep Learning based Python Library for Stock Market Prediction and Modelling

bulbea "Deep Learning based Python Library for Stock Market Prediction and Modelling." Table of Contents Installation Usage Documentation Dependencies

1.8k Jan 05, 2023

Code for the Paper: Conditional Variational Capsule Network for Open Set Recognition

Conditional Variational Capsule Network for Open Set Recognition This repository hosts the official code related to "Conditional Variational Capsule N

35 Nov 21, 2022

Interpolation-based reduced-order models

Interpolation-reduced-order-models Interpolation-based reduced-order models High-fidelity computational fluid dynamics (CFD) solutions are time consum

1 Jan 10, 2022

UMPNet: Universal Manipulation Policy Network for Articulated Objects

UMPNet: Universal Manipulation Policy Network for Articulated Objects Zhenjia Xu, Zhanpeng He, Shuran Song Columbia University Robotics and Automation

33 Dec 03, 2022

PyTorch code for the paper "Curriculum Graph Co-Teaching for Multi-target Domain Adaptation" (CVPR2021)

PyTorch code for the paper "Curriculum Graph Co-Teaching for Multi-target Domain Adaptation" (CVPR2021) This repo presents PyTorch implementation of M

79 Dec 19, 2022

A working implementation of the Categorical DQN (Distributional RL).

Categorical DQN. Implementation of the Categorical DQN as described in A distributional Perspective on Reinforcement Learning. Thanks to @tudor-berari

98 Sep 20, 2022

It's like Shape Editor in Maya but works with skeletons (transforms).

Skeleposer What is Skeleposer? Briefly, it's like Shape Editor in Maya, but works with transforms and joints. It can be used to make complex facial ri

1 Nov 11, 2022

Convolutional Neural Networks

Darknet Darknet is an open source neural network framework written in C and CUDA. It is fast, easy to install, and supports CPU and GPU computation. D

23.7k Jan 05, 2023

Piotr - IoT firmware emulation instrumentation for training and research

Piotr: Pythonic IoT exploitation and Research Introduction to Piotr Piotr is an emulation helper for Qemu that provides a convenient way to create, sh

51 Nov 09, 2022

Implementations of CNNs, RNNs, GANs, etc

Tensorflow Programs and Tutorials This repository will contain Tensorflow tutorials on a lot of the most popular deep learning concepts. It'll also co

1k Dec 30, 2022

A Joint Video and Image Encoder for End-to-End Retrieval

Frozen️ in Time ❄️ ️️️️ ⏳ A Joint Video and Image Encoder for End-to-End Retrieval project page | arXiv | webvid-data Repository containing the code,

225 Dec 25, 2022

AirCode: A Robust Object Encoding Method

AirCode This repo contains source codes for the arXiv preprint "AirCode: A Robust Object Encoding Method" Demo Object matching comparison when the obj

30 Dec 09, 2022

Atomistic Line Graph Neural Network

Table of Contents Introduction Installation Examples Pre-trained models Quick start using colab JARVIS-ALIGNN webapp Peformances on a few datasets Use

91 Dec 30, 2022