Monday, August 24, 2020

Sentiment Analysis Books (Aug 2020)


Google Search String: "sentiment analysis books" 1. Sentiment Analysis: Mining Opinions, Sentiments, and Emotions Book by Bing Liu Originally published: 28 May 2015 Author: Bing Liu Genre: Reference work 2. Sentiment Analysis and Opinion Mining Book by Bing Liu Originally published: 2012 Author: Bing Liu 3. A Practical Guide to Sentiment Analysis Book Originally published: 7 April 2017 Erik Cambria, Dipankar Das, Sivaji Bandyopadhyay, Antonio Feraco (eds.) Springer International Publishing 4. Sentiment Analysis in Social Networks Book Originally published: 30 September 2016 5. Opinion Mining and Sentiment Analysis Book by Bo Pang and Lillian Lee Originally published: 2008 Authors: Bo Pang, Lillian Lee 6. Deep Learning-Based Approaches for Sentiment Analysis Book Originally published: 24 January 2020 7. Text Mining with R: A Tidy Approach Book by David Robinson and Julia Silge Originally published: 2017 8. Advanced Positioning, Flow, and Sentiment Analysis in Commodity Markets: Bridging Fundamental and Technical Analysis Book by Mark J. S. Keenan Originally published: 20 December 2019 9. Prominent Feature Extraction for Sentiment Analysis Book by Basant Agarwal and Namita Mittal Originally published: 14 December 2015 10. Handbook of Sentiment Analysis in Finance Book Originally published: 2016 11. Sentiment Analysis and Knowledge Discovery in Contemporary Business Book Originally published: 25 May 2018 12. Visual and Text Sentiment Analysis Through Hierarchical Deep Learning Networks Book by Arindam Chaudhuri Originally published: 6 April 2019 Author: Arindam Chaudhuri 13. Affective Computing and Sentiment Analysis: Emotion, Metaphor and Terminology Book Originally published: 2011 Editor: Khurshid Ahmad 14. Sentiment Analysis and Ontology Engineering: An Environment of Computational Intelligence Book Originally published: 22 March 2016 15. 
Advances in Social Networking-based Learning: Machine Learning-based User Modelling and Sentiment Analysis Book by Christos Troussas and Maria Virvou Originally published: 20 January 2020 16. Machine Learning: An overview with the help of R software Book by Editor Ijsmi Originally published: 20 November 2018 17. People Analytics & Text Mining with R Book by Mong Shen Ng Originally published: 21 March 2019 18. The Successful Trader Foundation: How To Become The 1% Successful ... Book by Thang Duc Chu Originally published: 18 July 2019 19. Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data Book by Bing Liu Originally published: 30 May 2007 20. Deep Learning in Natural Language Processing Book Originally published: 23 May 2018 Li Deng, Yang Liu Springer 21. Multimodal Sentiment Analysis Novel by Amir Hussain, Erik Cambria, and Soujanya Poria Originally published: 24 October 2018 22. Sentic Computing: Techniques, Tools, and Applications Novel by Amir Hussain and Erik Cambria Originally published: 28 July 2012 23. Semantic Sentiment Analysis in Social Streams Book by Hassan Saif Originally published: 2017 Author: Hassan Saif Genre: Dissertation 24. Trading on Sentiment: The Power of Minds Over Markets Book by Richard L. Peterson Originally published: 2 March 2016 25. Natural Language Processing with Python Book by Edward Loper, Ewan Klein, and Steven Bird Originally published: June 2009 26. Sentiment Analysis Book by BLOKDYK. GERARDUS Originally published: May 2018 27. Applied Text Analysis with Python: Enabling Language-Aware Data Products with Machine Learning Book by Benjamin Bengfort, Rebecca Bilbro, and Tony Ojeda Originally published: 2018 28. Sentiment Analysis in the Bio-Medical Domain: Techniques, Tools, and Applications Book by Amir Hussain, Erik Cambria, and Ranjan Satapathy Originally published: 23 January 2018 29. 
Sentiment in the Forex Market: Indicators and Strategies To Profit from Crowd Behavior and Market Extremes Book by Jamie Saettele Originally published: 2008 30. Big Data Analytics with Java Book by Rajat Mehta Originally published: 28 July 2017 31. Sentiment Analysis for Social Media Book Originally published: 2 April 2020 32. Applying Sentiment Analysis for Tweets Linking to Scientific Papers Book by Natalie Friedrich Originally published: 21 December 2015 33. A Survey of Sentiment Analysis Book by Moritz Platt Originally published: May 2014 34. Textual Classification for Sentiment Detection. Brand Reputation Analysis on the ... Book by Mike Nkongolo Originally published: 10 April 2018 35. Company Fit: A Decision Support Tool Based on Feature Level Sentiment ... Book by Akshi Kumar Originally published: 30 August 2017 36. KNN Classifier Based Approach for Multi-Class Sentiment Analysis of Twitter Data Book by Soudamini Hota and Sudhir Pathak Originally published: 18 October 2017 37. A Classification Technique for Sentiment Analysis in Data Mining Book Originally published: 13 September 2017 38. Exploration of Competitive Market Behavior Using Near-Real-Time Sentiment Analysis Book by Norman Peitek Originally published: 30 December 2014 39. Sentiment Analysis for PTSD Signals Book by Demetrios Sapounas, Edward Rossini, and Vadim Kagan Originally published: 25 October 2013 40. Lifelong Machine Learning: Second Edition Book by Bing Liu and Zhiyuan Chen Originally published: 7 November 2016 41. Handbook of Natural Language Processing Book Originally published: 19 February 2010 42. Neural Network Methods in Natural Language Processing Book by Yoav Goldberg Originally published: 2017 43. The General Inquirer: A Computer Approach to Content Analysis Book by Philip James Stone Originally published: 1966 44. Plutchik, Robert (1980), Emotion: Theory, research, and experience: Vol. 1. Theories of emotion, 1, New York: Academic 45. 
Foundations of Statistical Natural Language Processing Book by Christopher D. Manning and Hinrich Schütze Originally published: 1999 46. Sentiment Analysis: Quick Reference Book by BLOKDYK. GERARDUS Originally published: 14 January 2018 47. Intelligent Asset Management Book by Erik Cambria, Frank Xing, and Roy Welsch Originally published: 13 November 2019 48. Sentic Computing: A Common-Sense-Based Framework for Concept-Level Sentiment Analysis Book by Amir Hussain and Erik Cambria Originally published: 11 December 2015 49. Computational Linguistics and Intelligent Text Processing: 18th International Conference, CICLing 2017, Budapest, Hungary, April 17–23, 2017, Revised Selected Papers, Part II Book Originally published: 9 October 2018 50. The SenticNet Sentiment Lexicon: Exploring Semantic Richness in Multi-Word Concepts Book by Raoul Biagioni Originally published: 28 May 2016

Sunday, August 23, 2020

Compare Two Files Using 'git diff'


Note: Our current directory is not in a Git repository. We have two files, "file_1.txt" and "file_2.txt".

"file_1.txt" has content:

Hermione is a good girl.
Hermione is a bad girl.
Hermione is a very good girl.

"file_2.txt" has content:

Hermione is a good girl.
No, Hermione is not a bad girl.
Hermione is a very good girl.

The color coding below is as it appears in the Windows CMD prompt:

C:\Users\Ashish Jain\OneDrive\Desktop>git diff --no-index file_1.txt file_2.txt
diff --git a/file_1.txt b/file_2.txt
index fc04cd5..52bdfd9 100644
--- a/file_1.txt
+++ b/file_2.txt
@@ -1,3 +1,3 @@
 Hermione is a good girl.
-Hermione is a bad girl.
+No, Hermione is not a bad girl.
 Hermione is a very good girl.
\ No newline at end of file

C:\Users\Ashish Jain\OneDrive\Desktop>git diff --no-index file_2.txt file_1.txt
diff --git a/file_2.txt b/file_1.txt
index 52bdfd9..fc04cd5 100644
--- a/file_2.txt
+++ b/file_1.txt
@@ -1,3 +1,3 @@
 Hermione is a good girl.
-No, Hermione is not a bad girl.
+Hermione is a bad girl.
 Hermione is a very good girl.
\ No newline at end of file

The output of "git diff" is with respect to the first file. The output of "git diff file_1.txt file_2.txt" is read as: "what changed in file_1 as we move from file_1 to file_2".

Saturday, August 22, 2020

Using Snorkel to create test data and classifying using Scikit-Learn


The dataset we have is the Iris dataset. We will augment it to create a "test" dataset and then use Scikit-Learn's Support Vector Machine classifier (SVC) to classify the test points into one of the Iris species.

import pandas as pd
import numpy as np

from snorkel.augmentation import transformation_function
from snorkel.augmentation import RandomPolicy
from snorkel.augmentation import PandasTFApplier

from sklearn import svm
from sklearn.metrics import accuracy_score
from sklearn.metrics import confusion_matrix

df = pd.read_csv('files_1/datasets_19_420_Iris.csv')

for i in set(df.Species):
    # Other columns are ['count', 'mean', 'std', 'min', '25%', '50%', '75%', 'max']
    print(i)
    print(df[df.Species == i].describe().loc[['min', '25%', '50%', '75%', 'max'], :])
	
features = ['SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm']
classes = ['Iris-setosa', 'Iris-virginica', 'Iris-versicolor']

desc_dict = {}
for i in classes:
    desc_dict[i] = df[df.Species == i].describe()

df['Train'] = 'Train'

# np.random.randint(low, high) returns a random integer N such that low <= N < high
@transformation_function(pre = [])
def get_new_instance_for_this_class(x):
    x.SepalLengthCm = np.random.randint(
        round(desc_dict[x.Species].loc[['25%'], ['SepalLengthCm']].iloc[0, 0], 2) * 100,
        round(desc_dict[x.Species].loc[['75%'], ['SepalLengthCm']].iloc[0, 0], 2) * 100) / 100
    x.SepalWidthCm = np.random.randint(
        round(desc_dict[x.Species].loc[['25%'], ['SepalWidthCm']].iloc[0, 0], 2) * 100,
        round(desc_dict[x.Species].loc[['75%'], ['SepalWidthCm']].iloc[0, 0], 2) * 100) / 100
    x.PetalLengthCm = np.random.randint(
        round(desc_dict[x.Species].loc[['25%'], ['PetalLengthCm']].iloc[0, 0], 2) * 100,
        round(desc_dict[x.Species].loc[['75%'], ['PetalLengthCm']].iloc[0, 0], 2) * 100) / 100
    x.PetalWidthCm = np.random.randint(
        round(desc_dict[x.Species].loc[['25%'], ['PetalWidthCm']].iloc[0, 0], 2) * 100,
        round(desc_dict[x.Species].loc[['75%'], ['PetalWidthCm']].iloc[0, 0], 2) * 100) / 100
    x.Train = 'Test'
    return x

tfs = [get_new_instance_for_this_class]

random_policy = RandomPolicy(
    len(tfs), sequence_length=2, n_per_original=1, keep_original=True
)

tf_applier = PandasTFApplier(tfs, random_policy)
df_train_augmented = tf_applier.apply(df)

print(f"Original training set size: {len(df)}")
print(f"Augmented training set size: {len(df_train_augmented)}")
Output:

Original training set size: 150
Augmented training set size: 300

df_test = df_train_augmented[df_train_augmented.Train == 'Test']

clf = svm.SVC(gamma = 'auto')
clf.fit(df[features], df['Species'])

Output:

SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
    decision_function_shape='ovr', degree=3, gamma='auto', kernel='rbf',
    max_iter=-1, probability=False, random_state=None, shrinking=True,
    tol=0.001, verbose=False)

pred = clf.predict(df_test[features])
print("Accuracy: {:.3f}".format(accuracy_score(df_test['Species'], pred)))
print("Confusion matrix:\n{}".format(confusion_matrix(df_test['Species'], pred)))
To confirm that we do not have an overlap in training and testing data:

left = df[features]
right = df_test[features]
print(left.merge(right, on = features, how = 'inner').shape)

Output: (0, 4)

And to confirm that every test row still carries an Id from the original data (the transformation changes the features but keeps the Id):

left = df[['Id']]
right = df_test[['Id']]
print(left.merge(right, on = ['Id'], how = 'inner').shape)

Output: (150, 1)
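The core idea of the transformation function above (sampling each feature uniformly between its class-conditional 25th and 75th percentiles) can be sketched in a self-contained form. The toy values below are hypothetical stand-ins, not the real Iris CSV:

```python
import numpy as np
import pandas as pd

# Toy stand-in for the Iris data (hypothetical values, not the real CSV).
df = pd.DataFrame({
    'PetalLengthCm': [1.4, 1.5, 1.3, 4.7, 4.5, 4.9],
    'Species': ['Iris-setosa'] * 3 + ['Iris-versicolor'] * 3,
})

def sample_within_iqr(df, feature, species, seed=0):
    """Draw a synthetic value uniformly between the 25th and 75th
    percentiles of `feature` for the given `species`."""
    sub = df.loc[df.Species == species, feature]
    lo, hi = sub.quantile(0.25), sub.quantile(0.75)
    return np.random.default_rng(seed).uniform(lo, hi)

v = sample_within_iqr(df, 'PetalLengthCm', 'Iris-setosa')
print(v)
```

Sampling inside the interquartile range keeps the synthetic points well inside each class's typical region, which is why the augmented "test" points remain separable by the SVC.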

Friday, August 21, 2020

Using Snorkel, SpaCy to augment text data


We have some data that looks like this in a file "names.csv":

names,text
Harry Potter,Harry Potter is the protagonist.
Ronald Weasley,Ronald Weasley is the chess expert.
Hermione Granger,Hermione is the super witch.
Hermione Granger,Hermione Granger weds Ron.

We augment this data by replacing the names in the "text" column with new, randomly selected names. For this we write the Python code given below:

import pandas as pd
import numpy as np
import names

from snorkel.augmentation import transformation_function
from snorkel.preprocess.nlp import SpacyPreprocessor

spacy = SpacyPreprocessor(text_field="text", doc_field="doc", memoize=True)

df = pd.read_csv('names.csv', encoding='cp1252')
print(df.head())
print()

# Pregenerate some random person names to replace existing ones with
# for the transformation strategies below
replacement_names = [names.get_full_name() for _ in range(50)]

# Replace a random named entity with a different entity of the same type.
@transformation_function(pre=[spacy])
def change_person(x):
    person_names = [ent.text for ent in x.doc.ents if ent.label_ == "PERSON"]
    # If there is at least one person name, replace a random one. Else return None.
    if person_names:
        name_to_replace = np.random.choice(person_names)
        replacement_name = np.random.choice(replacement_names)
        x.text = x.text.replace(name_to_replace, replacement_name)
        return x

tfs = [change_person]

from snorkel.augmentation import RandomPolicy
random_policy = RandomPolicy(
    len(tfs), sequence_length=2, n_per_original=1, keep_original=True
)

from snorkel.augmentation import PandasTFApplier
tf_applier = PandasTFApplier(tfs, random_policy)
df_train_augmented = tf_applier.apply(df)

print(f"Original training set size: {len(df)}")
print(f"Augmented training set size: {len(df_train_augmented)}")
print(df_train_augmented)

print("\nDebugging for 'Hermione':\n")
import spacy
nlp = spacy.load('en_core_web_sm')

def format_str(str, max_len = 25):
    str = str + " " * max_len
    return str[:max_len]

for i, row in df.iterrows():
    doc = nlp(row.text)
    for ent in doc.ents:
        print(format_str(ent.text), ent.label_)

The Snorkel version we are running:

(temp) E:\>conda list snorkel
# packages in environment at E:\programfiles\Anaconda3\envs\temp:
#
# Name     Version    Build    Channel
snorkel    0.9.3      py_0     conda-forge

Now, we run it in the "Anaconda Prompt":

(temp) E:\>python script.py
              names                                 text
0      Harry Potter     Harry Potter is the protagonist.
1    Ronald Weasley  Ronald Weasley is the chess expert.
2  Hermione Granger         Hermione is the super witch.
3  Hermione Granger           Hermione Granger weds Ron.
100%|██████████| 4/4 [00:00<00:00, 34.58it/s]
Original training set size: 4
Augmented training set size: 7
              names                                 text
0      Harry Potter     Harry Potter is the protagonist.
0      Harry Potter  Donald Gregoire is the protagonist.
1    Ronald Weasley  Ronald Weasley is the chess expert.
1    Ronald Weasley       John Hill is the chess expert.
2  Hermione Granger         Hermione is the super witch.
3  Hermione Granger           Hermione Granger weds Ron.
3  Hermione Granger      Jonathan Humphrey weds Ron.

Debugging for 'Hermione':

Harry Potter              PERSON
Ronald Weasley            PERSON
Hermione                  ORG
Hermione Granger          PERSON
Ron                       PERSON

There is an error with the name "Hermione": row 2 ("Hermione is the super witch.") was never augmented. Upon debugging we see that spaCy recognizes "Hermione" as an 'Organization' (ORG), not a 'Person', so the transformation function finds no PERSON entity in that row to replace and returns None.
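One possible workaround for the NER miss (an idea of this write-up's editor, not part of the original post) is to fall back on the dataset's own "names" column instead of relying on spaCy: try the full name first, then the first name. A minimal, spaCy-free sketch:

```python
import numpy as np
import pandas as pd

# Toy row mirroring the problematic line of names.csv.
df = pd.DataFrame({
    'names': ['Hermione Granger'],
    'text': ['Hermione is the super witch.'],
})

replacement_names = ['Donald Gregoire', 'John Hill']  # pregenerated stand-ins
rng = np.random.default_rng(0)

def change_person_fallback(row):
    """Replace the person's name in `text`, trying the full name first,
    then the first name, so 'Hermione' is caught even when NER misses it."""
    full = row['names']
    first = full.split()[0]
    target = full if full in row['text'] else first
    if target in row['text']:
        row = row.copy()
        row['text'] = row['text'].replace(target, rng.choice(replacement_names))
    return row

out = df.apply(change_person_fallback, axis=1)
print(out.text.iloc[0])
```

Because the ground-truth name is already a column in the CSV, this avoids depending on the entity labels at all for this dataset.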

Thursday, August 20, 2020

Using Conda to install and manage packages through YAML file and installing kernel



We are at path: E:\exp_snorkel\
File we are writing: env.yml

CONTENTS OF THE YAML FILE FOR THE ENVIRONMENT:

name: temp
channels:
  - conda-forge
dependencies:
  - pip
  - ca-certificates=2020.6.24=0
  - matplotlib=3.1.1
  - pip:
    - names==0.3.0
  - nltk=3.4.5
  - numpy>=1.16.0,<1.17.0
  - pandas>=0.24.0,<0.25.0
  - scikit-learn>=0.20.2
  - spacy>=2.1.6,<2.2.0
  - tensorflow=1.14.0
  - textblob=0.15.3
prefix: E:\programfiles\Anaconda3\envs\temp

WORKING IN THE CONDA SHELL:

(base) E:\exp_snorkel>conda remove -n temp --all

(base) E:\exp_snorkel>conda env create -f env.yml
Warning: you have pip-installed dependencies in your environment file, but you do not list pip itself as one of your conda dependencies. Conda may not use the correct pip to install your packages, and they may end up in the wrong place. Please add an explicit pip dependency. I'm adding one for you, but still nagging you.
Collecting package metadata (repodata.json): done
Solving environment: done
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
Installing pip dependencies: | Ran pip subprocess with arguments:
['E:\\programfiles\\Anaconda3\\envs\\temp\\python.exe', '-m', 'pip', 'install', '-U', '-r', 'E:\\exp_snorkel\\condaenv.m7xf9r2n.requirements.txt']
Pip subprocess output:
Collecting names==0.3.0
  Using cached names-0.3.0.tar.gz (789 kB)
Building wheels for collected packages: names
  Building wheel for names (setup.py): started
  Building wheel for names (setup.py): finished with status 'done'
  Created wheel for names: filename=names-0.3.0-py3-none-any.whl size=803694 sha256=318443a7ae55ef0d24be16b374517a61494a29b2a0f0f07d164ab1ef058efb0a
  Stored in directory: c:\users\ashish jain\appdata\local\pip\cache\wheels\05\ea\68\92f6b0669e478af9b7c3c524520d03050089e034edcc775c2b
Successfully built names
Installing collected packages: names
Successfully installed names-0.3.0
done
#
# To activate this environment, use
#
#     $ conda activate temp
#
# To deactivate an active environment, use
#
#     $ conda deactivate
MANAGING A KERNEL:

1. Create a kernel:

(base) E:\exp_snorkel>conda activate temp
(temp) E:\exp_snorkel>pip install ipykernel jupyter
(temp) E:\exp_snorkel>python -m ipykernel install --user --name temp

2. To remove a kernel from Jupyter Notebook (kernel name is "temp"):

(base) E:\exp_snorkel>jupyter kernelspec uninstall temp

3. To view all installed kernels:

(base) E:\exp_snorkel>jupyter kernelspec list
Available kernels:
  temp       C:\Users\Ashish Jain\AppData\Roaming\jupyter\kernels\temp
  python3    E:\programfiles\Anaconda3\share\jupyter\kernels\python3

Ref: docs.conda.io

Wednesday, August 19, 2020

Technology Listing related to web application security (Aug 2020)


1. Internet Message Access Protocol

In computing, the Internet Message Access Protocol (IMAP) is an Application Layer Internet standard protocol used by email clients to retrieve email messages from a mail server over a TCP/IP connection. The current version is defined by RFC 3501. IMAP was designed with the goal of permitting complete management of an email box by multiple email clients, so clients generally leave messages on the server until the user explicitly deletes them; this and other characteristics of IMAP operation allow multiple clients to manage the same mailbox. An IMAP server typically listens on well-known port 143, while IMAP over SSL (IMAPS) uses port 993.

Incoming email messages are sent to an email server that stores them in the recipient's email box, and the user retrieves them with an email client that uses one of a number of email retrieval protocols. While some clients and servers preferentially use vendor-specific, proprietary protocols, virtually all modern e-mail clients and servers support IMAP, which along with the earlier POP3 (Post Office Protocol) is one of the two most prevalent standard protocols for email retrieval. This broad support gives users free choice among many e-mail clients, such as Pegasus Mail or Mozilla Thunderbird, and allows those clients to be used with other servers. Many webmail service providers such as Gmail, Outlook.com and Yahoo! Mail also provide support for both IMAP and POP3.

IMAP offers access to the mail storage. Clients may store local copies of the messages, but these are considered to be a temporary cache.

Ref: Wikipedia - IMAP
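The retrieval flow described above can be sketched with Python's standard-library imaplib. The host and credentials are placeholders; an actual run needs a real IMAP server:

```python
import imaplib

# Standard ports from the protocol description above.
IMAP_PORT, IMAPS_PORT = 143, 993

def fetch_inbox_count(host, user, password):
    """Log in over IMAPS and return the number of messages in INBOX.
    (host/user/password are placeholders; this needs a real server.)"""
    with imaplib.IMAP4_SSL(host, IMAPS_PORT) as conn:
        conn.login(user, password)
        # readonly=True: inspect the mailbox without marking messages,
        # leaving them on the server as IMAP clients typically do.
        status, data = conn.select("INBOX", readonly=True)
        return int(data[0])
```

Note that imaplib itself exposes these well-known ports as `imaplib.IMAP4_PORT` (143) and `imaplib.IMAP4_SSL_PORT` (993).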

2. Kerberos (krb5)

Kerberos (/ˈkɜːrbərɒs/) is a computer-network authentication protocol that works on the basis of tickets to allow nodes communicating over a non-secure network to prove their identity to one another in a secure manner. The protocol was named after the character Kerberos (or Cerberus) from Greek mythology, the ferocious three-headed guard dog of Hades. Its designers aimed it primarily at a client–server model and it provides mutual authentication—both the user and the server verify each other's identity. Kerberos protocol messages are protected against eavesdropping and replay attacks. Kerberos builds on symmetric key cryptography and requires a trusted third party, and optionally may use public-key cryptography during certain phases of authentication. Kerberos uses UDP port 88 by default. Ref: Kerberos

3. Securing Java Enterprise Apps (Spring Security)

Spring Security is a Java/Java EE framework that provides authentication, authorization and other security features for enterprise applications.

Key authentication features:
% LDAP (using both bind-based and password comparison strategies) for centralization of authentication information.
% Single sign-on capabilities using the popular Central Authentication Service.
% Java Authentication and Authorization Service (JAAS) LoginModule, a standards-based method for authentication used within Java. Note this feature is only a delegation to a JAAS LoginModule.
% Basic access authentication as defined through RFC 1945.
% Digest access authentication as defined through RFC 2617 and RFC 2069.
% X.509 client certificate presentation over the Secure Sockets Layer standard.
% CA, Inc. SiteMinder for authentication (a popular commercial access management product).
% Su (Unix)-like support for switching principal identity over an HTTP or HTTPS connection.
% Run-as replacement, which enables an operation to assume a different security identity.
% Anonymous authentication, which means that even unauthenticated principals are allocated a security identity.
% Container adapter (custom realm) support for Apache Tomcat, Resin, JBoss and Jetty (web server).
% Windows NTLM to enable browser integration (experimental).
% Web form authentication, similar to the servlet container specification.
% "Remember-me" support via HTTP cookies.
% Concurrent session support, which limits the number of simultaneous logins permitted by a principal.
% Full support for customization and plugging in custom authentication implementations.

Ref: Spring Security

4. LDAP (Lightweight Directory Access Protocol)

The Lightweight Directory Access Protocol (LDAP /ˈɛldæp/) is an open, vendor-neutral, industry standard application protocol for accessing and maintaining distributed directory information services over an Internet Protocol (IP) network. Directory services play an important role in developing intranet and Internet applications by allowing the sharing of information about users, systems, networks, services, and applications throughout the network. As examples, directory services may provide any organized set of records, often with a hierarchical structure, such as a corporate email directory. Similarly, a telephone directory is a list of subscribers with an address and a phone number.

LDAP is specified in a series of Internet Engineering Task Force (IETF) Standard Track publications called Requests for Comments (RFCs), using the description language ASN.1. The latest specification is Version 3, published as RFC 4511 (a road map to the technical specifications is provided by RFC 4510).

A common use of LDAP is to provide a central place to store usernames and passwords. This allows many different applications and services to connect to the LDAP server to validate users. LDAP is based on a simpler subset of the standards contained within the X.500 standard. Because of this relationship, LDAP is sometimes called X.500-lite.

Ref: LDAP (Lightweight Directory Access Protocol)

5. Keycloak

Keycloak is an open source software product that provides single sign-on with Identity Management and Access Management, aimed at modern applications and services. As of March 2018 this JBoss community project is under the stewardship of Red Hat, who use it as the upstream project for their RH-SSO product.

Features of Keycloak include:
% User registration
% Social login
% Single sign-on/sign-off across all applications belonging to the same realm
% 2-factor authentication
% LDAP integration
% Kerberos broker
% Multitenancy with a per-realm customizable skin

Components: there are two main components of Keycloak:
% Keycloak server
% Keycloak application adapter

Ref: Keycloak

6. OAuth

OAuth is an open standard for access delegation, commonly used as a way for Internet users to grant websites or applications access to their information on other websites but without giving them the passwords. This mechanism is used by companies such as Amazon, Google, Facebook, Microsoft and Twitter to permit users to share information about their accounts with third-party applications or websites.

Generally, OAuth provides clients a "secure delegated access" to server resources on behalf of a resource owner. It specifies a process for resource owners to authorize third-party access to their server resources without sharing their credentials. Designed specifically to work with Hypertext Transfer Protocol (HTTP), OAuth essentially allows access tokens to be issued to third-party clients by an authorization server, with the approval of the resource owner. The third party then uses the access token to access the protected resources hosted by the resource server.

OAuth is a service that is complementary to and distinct from OpenID. OAuth is unrelated to OATH, which is a reference architecture for authentication, not a standard for authorization. However, OAuth is directly related to OpenID Connect (OIDC), since OIDC is an authentication layer built on top of OAuth 2.0. OAuth is also unrelated to XACML, which is an authorization policy standard. OAuth can be used in conjunction with XACML, where OAuth is used for ownership consent and access delegation whereas XACML is used to define the authorization policies (e.g., managers can view documents in their region).

Ref: OAuth
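The first step of the delegation described above is the client redirecting the user to the authorization server. That handoff can be sketched as building the standard OAuth 2.0 authorization-code URL; the endpoint and parameter values below are hypothetical examples, not from any real provider:

```python
from urllib.parse import urlencode

# Hypothetical authorization server and client registration.
AUTH_ENDPOINT = "https://auth.example.com/authorize"

def build_authorization_url(client_id, redirect_uri, scope, state):
    """Assemble the standard OAuth 2.0 authorization-code request."""
    params = {
        "response_type": "code",   # ask for an authorization code
        "client_id": client_id,    # identifies the third-party client
        "redirect_uri": redirect_uri,
        "scope": scope,            # what access is being delegated
        "state": state,            # opaque value for CSRF protection
    }
    return f"{AUTH_ENDPOINT}?{urlencode(params)}"

url = build_authorization_url(
    "my-app", "https://app.example.com/cb", "read:docs", "xyz123")
print(url)
```

After the resource owner approves, the server redirects back to `redirect_uri` with a short-lived code, which the client exchanges (server-to-server) for the access token it will present to the resource server.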

7. OpenID

OpenID is an open standard and decentralized authentication protocol. Promoted by the non-profit OpenID Foundation, it allows users to be authenticated by co-operating sites (known as relying parties, or RP) using a third-party service, eliminating the need for webmasters to provide their own ad hoc login systems, and allowing users to log into multiple unrelated websites without having to have a separate identity and password for each. Users create accounts by selecting an OpenID identity provider and then use those accounts to sign onto any website that accepts OpenID authentication. Several large organizations either issue or accept OpenIDs on their websites, according to the OpenID Foundation.

The OpenID standard provides a framework for the communication that must take place between the identity provider and the OpenID acceptor (the "relying party"). An extension to the standard (the OpenID Attribute Exchange) facilitates the transfer of user attributes, such as name and gender, from the OpenID identity provider to the relying party (each relying party may request a different set of attributes, depending on its requirements). The OpenID protocol does not rely on a central authority to authenticate a user's identity. Moreover, neither services nor the OpenID standard may mandate a specific means by which to authenticate users, allowing for approaches ranging from the common (such as passwords) to the novel (such as smart cards or biometrics). The final version of OpenID is OpenID 2.0, finalized and published in December 2007. The term OpenID may also refer to an identifier as specified in the OpenID standard; these identifiers take the form of a unique Uniform Resource Identifier (URI), and are managed by some "OpenID provider" that handles authentication.

OpenID vs OAuth: in short, OpenID is a protocol for authentication (verifying who the user is), while OAuth is a standard for authorization (delegating access to resources without sharing credentials).
The Top 10 OWASP vulnerabilities in 2020 are:
% Injection
% Broken authentication
% Sensitive data exposure
% XML External Entities (XXE)
% Broken access control
% Security misconfigurations
% Cross-Site Scripting (XSS)
% Insecure deserialization
% Using components with known vulnerabilities
% Insufficient logging and monitoring

Ref (a): OpenID
Ref (b): Top 10 Security Vulnerabilities in 2020

8. Heroku

Heroku is a cloud platform as a service (PaaS) supporting several programming languages. One of the first cloud platforms, Heroku has been in development since June 2007, when it supported only the Ruby programming language, but now supports Java, Node.js, Scala, Clojure, Python, PHP, and Go. For this reason, Heroku is said to be a polyglot platform as it has features for a developer to build, run and scale applications in a similar manner across most languages. Heroku was acquired by Salesforce.com in 2010 for $212 million. Ref: Heroku

9. Facebook Prophet

Prophet is a procedure for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data. Prophet is robust to missing data and shifts in the trend, and typically handles outliers well. Prophet is open source software released by Facebook's Core Data Science team. It is available for download on CRAN and PyPI.

% CRAN is a network of ftp and web servers around the world that store identical, up-to-date versions of code and documentation for R. Link: https://cran.r-project.org/

The Prophet procedure includes many possibilities for users to tweak and adjust forecasts. You can use human-interpretable parameters to improve your forecast by adding your domain knowledge.

Ref: Facebook.github.io

10. Snorkel

% Programmatically Build Training Data

The Snorkel team is now focusing their efforts on Snorkel Flow, an end-to-end AI application development platform based on the core ideas behind Snorkel.

The Snorkel project started at Stanford in 2016 with a simple technical bet: that it would increasingly be the training data, not the models, algorithms, or infrastructure, that decided whether a machine learning project succeeded or failed. Given this premise, we set out to explore the radical idea that you could bring mathematical and systems structure to the messy and often entirely manual process of training data creation and management, starting by empowering users to programmatically label, build, and manage training data.

To say that the Snorkel project succeeded and expanded beyond what we had ever expected would be an understatement. The basic goals of a research repo like Snorkel are to provide a minimum viable framework for testing and validating hypotheses. Related innovations are in weak supervision modeling, data augmentation, multi-task learning, and more. The ideas behind Snorkel change not just how you label training data, but much of the entire lifecycle and pipeline of building, deploying, and managing ML: how users inject their knowledge; how models are constructed, trained, inspected, versioned, and monitored; how entire pipelines are developed iteratively; and how the full set of stakeholders in any ML deployment, from subject matter experts to ML engineers, are incorporated into the process. Over the last year, we have been building the platform to support this broader vision: Snorkel Flow, an end-to-end machine learning platform for developing and deploying AI applications.
Snorkel Flow incorporates many of the concepts of the Snorkel project with a range of newer techniques around weak supervision modeling, data augmentation, multi-task learning, data slicing and structuring, monitoring and analysis, and more, all of which integrate in a way that is greater than the sum of its parts–and that we believe makes ML truly faster, more flexible, and more practical than ever before. Ref: snorkel.org

Tuesday, August 18, 2020

Data Science Timeline (Aug 2020)



Monday, August 17, 2020

Differences between Conda and Pip installation. Installing TensorFlow 2.1.0 using Conda on Windows 10 And 'Hello World' program.



TensorFlow is a Python library for high-performance numerical calculations that allows users to create sophisticated deep learning and machine learning applications.
There are a number of methods that can be used to install TensorFlow, such as using pip to install the wheels available on PyPI. Installing TensorFlow using conda packages offers a number of benefits, including a complete package management system, wider platform support, a more streamlined GPU experience, and better CPU performance. These packages are available via the Anaconda Repository, and installing them is as easy as running “conda install tensorflow” or “conda install tensorflow-gpu” from a command line interface.

One key benefit of installing TensorFlow using conda rather than pip is a result of the conda package management system. When TensorFlow is installed using conda, conda installs all the necessary and compatible dependencies for the packages as well. This is done automatically; users do not need to install any additional software via system package managers or other means.
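You can see the result of this automatic dependency resolution from inside Python. The standard library's importlib.metadata (not mentioned in the original post; shown here as a convenient cross-check) lists every distribution installed in the active environment, so after `conda install tensorflow` the pulled-in dependencies (numpy, grpcio, protobuf, and so on) all appear:

```python
from importlib import metadata

# Enumerate installed distributions in the current environment.
# The guard skips any distribution with a missing Name field.
names = sorted({d.metadata["Name"] for d in metadata.distributions() if d.metadata["Name"]})
print(len(names), "packages installed; first few:", names[:5])
```

Running this before and after a conda install makes the dependency closure that conda resolved for you explicit.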

Like other packages in the Anaconda repository, TensorFlow is supported on a number of platforms. TensorFlow conda packages are available for Windows, Linux, and macOS. The Linux packages for the 1.10.0 release support a number of Linux distributions, including older distributions such as CentOS 6. This is a further benefit of the conda packages: in spite of being labeled as manylinux1-compatible (i.e., intended to work on many versions of Linux), the wheels available on PyPI effectively require Ubuntu 16.04 or newer, which is much newer than many enterprise Linux installations.

The conda TensorFlow packages are also designed for better performance on CPUs through the use of the Intel® Math Kernel Library for Deep Neural Networks (Intel® MKL-DNN). Starting with version 1.9.0, the conda TensorFlow packages are built using the Intel® MKL-DNN library, which demonstrates considerable performance improvements. For example, Figure 1 compares the performance of training and inference on two different image classification models using TensorFlow installed using conda versus the same version installed using pip. The performance of the conda-installed version is over eight times the speed of the pip-installed package in many of the benchmarks.
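A rough, unscientific way to feel the MKL effect yourself is to time a large matrix multiplication in each environment. The sketch below (my addition, not from the Anaconda benchmark) measures NumPy matmul throughput; comparing the printed number between a pip-installed and a conda/MKL-installed environment gives a quick sanity check, though it is no substitute for the proper benchmarks in Figure 1:

```python
import time
import numpy as np

def matmul_gflops(n=512, repeats=5):
    """Return approximate GFLOP/s for repeated n x n matrix multiplies."""
    a = np.random.rand(n, n)
    b = np.random.rand(n, n)
    start = time.perf_counter()
    for _ in range(repeats):
        a @ b
    elapsed = time.perf_counter() - start
    flops = 2 * n**3 * repeats  # multiply-add count for an n x n matmul
    return flops / elapsed / 1e9

print(f"{matmul_gflops():.1f} GFLOP/s")
```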

Figure 1: Training performance of TensorFlow on a number of common deep learning models using synthetic data. Benchmarks were performed on an Intel® Xeon® Gold 6130.

Interested in trying out these TensorFlow packages? After installing Anaconda or Miniconda, create a new conda environment containing TensorFlow and activate it:

$ conda create -n tensorflow_env tensorflow
$ conda activate tensorflow_env

Or, for the GPU version:

$ conda create -n tensorflow_gpuenv tensorflow-gpu
$ conda activate tensorflow_gpuenv

TensorFlow is now installed and ready for use. For those new to TensorFlow, the tutorials offer a great place to get started.

Ref: Anaconda Blog

Implementation

(base) C:\Users\ashish>conda env list
# conda environments:
#
base  *  D:\programfiles\Anaconda3
py38     D:\programfiles\Anaconda3\envs\py38

(base) C:\Users\ashish>pip install tensorflow==
ERROR: Could not find a version that satisfies the requirement tensorflow== (from versions: 1.13.0rc1, 1.13.0rc2, 1.13.1, 1.13.2, 1.14.0rc0, 1.14.0rc1, 1.14.0, 1.15.0rc0, 1.15.0rc1, 1.15.0rc2, 1.15.0rc3, 1.15.0, 1.15.2, 1.15.3, 2.0.0a0, 2.0.0b0, 2.0.0b1, 2.0.0rc0, 2.0.0rc1, 2.0.0rc2, 2.0.0, 2.0.1, 2.0.2, 2.1.0rc0, 2.1.0rc1, 2.1.0rc2, 2.1.0, 2.1.1, 2.2.0rc0, 2.2.0rc1, 2.2.0rc2, 2.2.0rc3, 2.2.0rc4, 2.2.0, 2.3.0rc0, 2.3.0rc1, 2.3.0rc2, 2.3.0)
ERROR: No matching distribution found for tensorflow==

(base) C:\Users\ashish>conda create -n tf
Collecting package metadata (current_repodata.json): done
Solving environment: done

==> WARNING: A newer version of conda exists. <==
  current version: 4.8.3
  latest version: 4.8.4

Please update conda by running

    $ conda update -n base -c defaults conda

## Package Plan ##

  environment location: D:\programfiles\Anaconda3\envs\tf

Proceed ([y]/n)?
y
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate tf
#
# To deactivate an active environment, use
#
#     $ conda deactivate

(base) C:\Users\ashish>conda activate tf

(tf) C:\Users\ashish>pip install tensorflow==2.0.1
'pip' is not recognized as an internal or external command, operable program or batch file.

(tf) C:\Users\ashish>conda install tensorflow==2.0.1
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.

PackagesNotFoundError: The following packages are not available from current channels:
  - tensorflow==2.0.1

Current channels:
  - https://repo.anaconda.com/pkgs/main/win-64
  - https://repo.anaconda.com/pkgs/main/noarch
  - https://repo.anaconda.com/pkgs/r/win-64
  - https://repo.anaconda.com/pkgs/r/noarch
  - https://repo.anaconda.com/pkgs/msys2/win-64
  - https://repo.anaconda.com/pkgs/msys2/noarch

To search for alternate channels that may provide the conda package you're looking for, navigate to https://anaconda.org and use the search bar at the top of the page.

(tf) C:\Users\ashish>conda deactivate

(base) C:\Users\ashish>conda remove --name tf --all
Remove all packages in environment D:\programfiles\Anaconda3\envs\tf:
No packages found in D:\programfiles\Anaconda3\envs\tf. Continuing environment removal

(base) C:\Users\ashish>conda env list
# conda environments:
#
base  *  D:\programfiles\Anaconda3
py38     D:\programfiles\Anaconda3\envs\py38

(base) C:\Users\ashish>conda create -n tf tensorflow
Collecting package metadata (current_repodata.json): done
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: done

==> WARNING: A newer version of conda exists. <==
  current version: 4.8.3
  latest version: 4.8.4

Please update conda by running

    $ conda update -n base -c defaults conda

## Package Plan ##

  environment location: D:\programfiles\Anaconda3\envs\tf

  added / updated specs:
    - tensorflow

The following packages will be downloaded:

    package                      | build                | size
    -----------------------------|----------------------|---------
    _tflow_select-2.2.0          | eigen                |   3 KB
    absl-py-0.9.0                | py37_0               | 168 KB
    astor-0.8.1                  | py37_0               |  47 KB
    blinker-1.4                  | py37_0               |  22 KB
    brotlipy-0.7.0               | py37he774522_1000    | 336 KB
    ca-certificates-2020.6.24    | 0                    | 125 KB
    cachetools-4.1.1             | py_0                 |  12 KB
    certifi-2020.6.20            | py37_0               | 156 KB
    click-7.1.2                  | py_0                 |  71 KB
    cryptography-2.9.2           | py37h7a1dbc1_0       | 523 KB
    gast-0.2.2                   | py37_0               | 155 KB
    google-auth-1.20.1           | py_0                 |  55 KB
    google-auth-oauthlib-0.4.1   | py_2                 |  20 KB
    google-pasta-0.2.0           | py_0                 |  46 KB
    grpcio-1.27.2                | py37h351948d_0       | 1.2 MB
    idna-2.10                    | py_0                 |  50 KB
    importlib-metadata-1.7.0     | py37_0               |  52 KB
    keras-applications-1.0.8     | py_1                 |  29 KB
    keras-preprocessing-1.1.0    | py_1                 |  37 KB
    libprotobuf-3.12.4           | h200bbdf_0           | 1.8 MB
    markdown-3.2.2               | py37_0               | 136 KB
    mkl_fft-1.1.0                | py37h45dec08_0       | 116 KB
    mkl_random-1.1.1             | py37h47e9c7a_0       | 233 KB
    numpy-1.19.1                 | py37h5510c5b_0       |  22 KB
    numpy-base-1.19.1            | py37ha3acd2a_0       | 3.8 MB
    oauthlib-3.1.0               | py_0                 |  91 KB
    openssl-1.1.1g               | he774522_1           | 4.8 MB
    opt_einsum-3.1.0             | py_0                 |  54 KB
    pip-20.2.2                   | py37_0               | 1.7 MB
    protobuf-3.12.4              | py37ha925a31_0       | 555 KB
    pyasn1-0.4.8                 | py_0                 |  57 KB
    pyasn1-modules-0.2.7         | py_0                 |  68 KB
    pycparser-2.20               | py_2                 |  94 KB
    pyjwt-1.7.1                  | py37_0               |  49 KB
    pyopenssl-19.1.0             | py_1                 |  48 KB
    pysocks-1.7.1                | py37_1               |  28 KB
    python-3.7.7                 | h81c818b_4           | 14.3 MB
    requests-2.24.0              | py_0                 |  56 KB
    requests-oauthlib-1.3.0      | py_0                 |  23 KB
    rsa-4.6                      | py_0                 |  26 KB
    scipy-1.5.0                  | py37h9439919_0       | 11.8 MB
    setuptools-49.6.0            | py37_0               | 771 KB
    sqlite-3.32.3                | h2a8f88b_0           | 802 KB
    tensorboard-2.2.1            | pyh532a8cf_0         | 2.4 MB
    tensorboard-plugin-wit-1.6.0 | py_0                 | 630 KB
    tensorflow-2.1.0             | eigen_py37hd727fc0_0 |   4 KB
    tensorflow-base-2.1.0        | eigen_py37h49b2757_0 | 35.4 MB
    tensorflow-estimator-2.1.0   | pyhd54b08b_0         | 251 KB
    termcolor-1.1.0              | py37_1               |   8 KB
    urllib3-1.25.10              | py_0                 |  98 KB
    vs2015_runtime-14.16.27012   | hf0eaf9b_3           | 1.2 MB
    werkzeug-0.16.1              | py_0                 | 258 KB
    wrapt-1.12.1                 | py37he774522_1       |  49 KB
    zipp-3.1.0                   | py_0                 |  13 KB
    ------------------------------------------------------------
                                            Total:       84.8 MB

The following NEW packages will be INSTALLED:

  _tflow_select       pkgs/main/win-64::_tflow_select-2.2.0-eigen
  absl-py             pkgs/main/win-64::absl-py-0.9.0-py37_0
  astor               pkgs/main/win-64::astor-0.8.1-py37_0
  blas                pkgs/main/win-64::blas-1.0-mkl
  blinker             pkgs/main/win-64::blinker-1.4-py37_0
  brotlipy            pkgs/main/win-64::brotlipy-0.7.0-py37he774522_1000
  ca-certificates     pkgs/main/win-64::ca-certificates-2020.6.24-0
  cachetools          pkgs/main/noarch::cachetools-4.1.1-py_0
  certifi             pkgs/main/win-64::certifi-2020.6.20-py37_0
  cffi                pkgs/main/win-64::cffi-1.14.0-py37h7a1dbc1_0
  chardet             pkgs/main/win-64::chardet-3.0.4-py37_1003
  click               pkgs/main/noarch::click-7.1.2-py_0
  cryptography        pkgs/main/win-64::cryptography-2.9.2-py37h7a1dbc1_0
  gast                pkgs/main/win-64::gast-0.2.2-py37_0
  google-auth         pkgs/main/noarch::google-auth-1.20.1-py_0
  google-auth-oauth~  pkgs/main/noarch::google-auth-oauthlib-0.4.1-py_2
  google-pasta        pkgs/main/noarch::google-pasta-0.2.0-py_0
  grpcio              pkgs/main/win-64::grpcio-1.27.2-py37h351948d_0
  h5py                pkgs/main/win-64::h5py-2.10.0-py37h5e291fa_0
  hdf5                pkgs/main/win-64::hdf5-1.10.4-h7ebc959_0
  icc_rt              pkgs/main/win-64::icc_rt-2019.0.0-h0cc432a_1
  idna                pkgs/main/noarch::idna-2.10-py_0
  importlib-metadata  pkgs/main/win-64::importlib-metadata-1.7.0-py37_0
  intel-openmp        pkgs/main/win-64::intel-openmp-2020.1-216
  keras-applications  pkgs/main/noarch::keras-applications-1.0.8-py_1
  keras-preprocessi~  pkgs/main/noarch::keras-preprocessing-1.1.0-py_1
  libprotobuf         pkgs/main/win-64::libprotobuf-3.12.4-h200bbdf_0
  markdown            pkgs/main/win-64::markdown-3.2.2-py37_0
  mkl                 pkgs/main/win-64::mkl-2020.1-216
  mkl-service         pkgs/main/win-64::mkl-service-2.3.0-py37hb782905_0
  mkl_fft             pkgs/main/win-64::mkl_fft-1.1.0-py37h45dec08_0
  mkl_random          pkgs/main/win-64::mkl_random-1.1.1-py37h47e9c7a_0
  numpy               pkgs/main/win-64::numpy-1.19.1-py37h5510c5b_0
  numpy-base          pkgs/main/win-64::numpy-base-1.19.1-py37ha3acd2a_0
  oauthlib            pkgs/main/noarch::oauthlib-3.1.0-py_0
  openssl             pkgs/main/win-64::openssl-1.1.1g-he774522_1
  opt_einsum          pkgs/main/noarch::opt_einsum-3.1.0-py_0
  pip                 pkgs/main/win-64::pip-20.2.2-py37_0
  protobuf            pkgs/main/win-64::protobuf-3.12.4-py37ha925a31_0
  pyasn1              pkgs/main/noarch::pyasn1-0.4.8-py_0
  pyasn1-modules      pkgs/main/noarch::pyasn1-modules-0.2.7-py_0
  pycparser           pkgs/main/noarch::pycparser-2.20-py_2
  pyjwt               pkgs/main/win-64::pyjwt-1.7.1-py37_0
  pyopenssl           pkgs/main/noarch::pyopenssl-19.1.0-py_1
  pyreadline          pkgs/main/win-64::pyreadline-2.1-py37_1
  pysocks             pkgs/main/win-64::pysocks-1.7.1-py37_1
  python              pkgs/main/win-64::python-3.7.7-h81c818b_4
  requests            pkgs/main/noarch::requests-2.24.0-py_0
  requests-oauthlib   pkgs/main/noarch::requests-oauthlib-1.3.0-py_0
  rsa                 pkgs/main/noarch::rsa-4.6-py_0
  scipy               pkgs/main/win-64::scipy-1.5.0-py37h9439919_0
  setuptools          pkgs/main/win-64::setuptools-49.6.0-py37_0
  six                 pkgs/main/noarch::six-1.15.0-py_0
  sqlite              pkgs/main/win-64::sqlite-3.32.3-h2a8f88b_0
  tensorboard         pkgs/main/noarch::tensorboard-2.2.1-pyh532a8cf_0
  tensorboard-plugi~  pkgs/main/noarch::tensorboard-plugin-wit-1.6.0-py_0
  tensorflow          pkgs/main/win-64::tensorflow-2.1.0-eigen_py37hd727fc0_0
  tensorflow-base     pkgs/main/win-64::tensorflow-base-2.1.0-eigen_py37h49b2757_0
  tensorflow-estima~  pkgs/main/noarch::tensorflow-estimator-2.1.0-pyhd54b08b_0
  termcolor           pkgs/main/win-64::termcolor-1.1.0-py37_1
  urllib3             pkgs/main/noarch::urllib3-1.25.10-py_0
  vc                  pkgs/main/win-64::vc-14.1-h0510ff6_4
  vs2015_runtime      pkgs/main/win-64::vs2015_runtime-14.16.27012-hf0eaf9b_3
  werkzeug            pkgs/main/noarch::werkzeug-0.16.1-py_0
  wheel               pkgs/main/win-64::wheel-0.34.2-py37_0
  win_inet_pton       pkgs/main/win-64::win_inet_pton-1.1.0-py37_0
  wincertstore        pkgs/main/win-64::wincertstore-0.2-py37_0
  wrapt               pkgs/main/win-64::wrapt-1.12.1-py37he774522_1
  zipp                pkgs/main/noarch::zipp-3.1.0-py_0
  zlib                pkgs/main/win-64::zlib-1.2.11-h62dcd97_4

Proceed ([y]/n)? y

...

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate tf
#
# To deactivate an active environment, use
#
#     $ conda deactivate

(base) C:\Users\ashish>conda activate tf

(tf) C:\Users\ashish>pip show tensorflow
Name: tensorflow
Version: 2.1.0
Summary: TensorFlow is an open source machine learning framework for everyone.
Home-page: https://www.tensorflow.org/
Author: Google Inc.
Author-email: packages@tensorflow.org
License: Apache 2.0
Location: d:\programfiles\anaconda3\envs\tf\lib\site-packages
Requires: opt-einsum, six, gast, keras-preprocessing, wrapt, numpy, tensorboard, keras-applications, absl-py, google-pasta, termcolor, wheel, protobuf, scipy, grpcio, tensorflow-estimator, astor
Required-by:

In TensorFlow 1.x, we would have written code like this to check the TensorFlow installation:

import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))

Out: b'Hello, TensorFlow!'

Ref: StackOverflow

This won't work in TensorFlow 2.x. With TensorFlow 2.x, you would get the error message "AttributeError: module 'tensorflow' has no attribute 'Session'" for the line "sess = tf.Session()". The TF 2.x "hello world" program would be:

import tensorflow as tf
msg = tf.constant('Hello, TensorFlow!')
tf.print(msg)

Out: Hello, TensorFlow!

(tf) C:\Users\ashish>python
Python 3.7.7 (default, May  6 2020, 11:45:54) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> tf.__version__
'2.1.0'
>>> hello = tf.constant('Hello, TensorFlow!')
2020-08-17 12:59:21.446424: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
>>> hello
<tf.Tensor: shape=(), dtype=string, numpy=b'Hello, TensorFlow!'>
>>> print(hello)
tf.Tensor(b'Hello, TensorFlow!', shape=(), dtype=string)
>>> msg = tf.constant('Hello, TensorFlow!')
>>> tf.print(msg)
Hello, TensorFlow!
>>> tf.print(hello)
Hello, TensorFlow!

The reason for this is that TF2 runs eager execution by default, removing the need for Sessions. If you want to run static graphs, the more proper way is to use tf.function() in TF2. While Session can still be accessed via tf.compat.v1.Session() in TF2, I would discourage using it.

References:
% TensorFlow.Org
% StackOverflow: AttributeError: module 'tensorflow' has no attribute 'session'

Sunday, August 16, 2020

Failing Installation of TensorFlow (2.3.0-cp37) on Windows 10 in Anaconda Prompt



On Windows 10 (in Aug 2020):

(base) C:\Users\Ashish Jain>conda create -n tf

(base) C:\Users\Ashish Jain>conda activate tf

(tf) C:\Users\Ashish Jain>pip install tensorflow 

Collecting tensorflow
  Downloading https://files.pythonhosted.org/packages/c6/dc/9030097e5774fe02d1be3cb42eb54125d2c0607a6c11172f1dcad2b7fdcc/tensorflow-2.3.0-cp37-cp37m-win_amd64.whl (342.5MB)
  
...

Collecting numpy<1.19.0,>=1.16.0 (from tensorflow)
  Downloading https://files.pythonhosted.org/packages/e4/01/7a26148f7de9eb6c27f95b29eba16b7e820827cb9ebaae182d7483e44711/numpy-1.18.5-cp37-cp37m-win_amd64.whl (12.7MB)
    100% |████████████████████████████████| 12.7MB 706kB/s  

...

Installing collected packages: termcolor, numpy, keras-preprocessing, astunparse, absl-py, setuptools, tensorboard-plugin-wit, markdown, grpcio, tensorboard, google-pasta, scipy, opt-einsum, tensorflow-estimator, h5py, wrapt, gast, tensorflow
  Found existing installation: numpy 1.15.4
    Uninstalling numpy-1.15.4:
Could not install packages due to an EnvironmentError: [WinError 5] Access is denied: 'e:\\programfiles\\anaconda3\\lib\\site-packages\\numpy\\core\\multiarray.cp37-win_amd64.pyd'
Consider using the `--user` option or check the permissions. 

(tf) C:\Users\Ashish Jain>pip install tensorflow --user

...
  Installing collected packages: grpcio, numpy, absl-py, markdown, setuptools, tensorboard-plugin-wit, tensorboard, gast, tensorflow-estimator, h5py, opt-einsum, astunparse, wrapt, google-pasta, keras-preprocessing, scipy, tensorflow
  The script f2py.exe is installed in 'C:\Users\Ashish Jain\AppData\Roaming\Python\Python37\Scripts' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  The script markdown_py.exe is installed in 'C:\Users\Ashish Jain\AppData\Roaming\Python\Python37\Scripts' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  The script tensorboard.exe is installed in 'C:\Users\Ashish Jain\AppData\Roaming\Python\Python37\Scripts' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  The scripts estimator_ckpt_converter.exe, saved_model_cli.exe, tensorboard.exe, tf_upgrade_v2.exe, tflite_convert.exe, toco.exe and toco_from_protos.exe are installed in 'C:\Users\Ashish Jain\AppData\Roaming\Python\Python37\Scripts' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed absl-py-0.9.0 astunparse-1.6.3 gast-0.3.3 google-pasta-0.2.0 grpcio-1.31.0 h5py-2.10.0 keras-preprocessing-1.1.2 markdown-3.2.2 numpy-1.18.5 opt-einsum-3.3.0 scipy-1.4.1 setuptools-49.6.0 tensorboard-2.3.0 tensorboard-plugin-wit-1.7.0 tensorflow-2.3.0 tensorflow-estimator-2.3.0 wrapt-1.12.1 

Added "C:\Users\Ashish Jain\AppData\Roaming\Python\Python37\Scripts" to the PATH variable.
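To see exactly which directories a `pip install --user` uses (and therefore what needs to be on PATH), you can query Python itself. This snippet is my addition for context, using the standard library's site and sysconfig modules; the scripts path it prints is the same directory pip warned about above:

```python
import os
import site
import sysconfig

# Where `pip install --user` puts packages: the per-user site-packages
# directory, which avoids the "Access is denied" error because it does not
# touch the (possibly read-only) base Anaconda installation.
print("user site-packages:", site.getusersitepackages())

# Where `pip install --user` puts console scripts (tensorboard.exe, f2py.exe,
# ...). This is the directory that must be added to PATH manually.
# The scheme name is "nt_user" on Windows and "posix_user" elsewhere.
print("user scripts dir:", sysconfig.get_path("scripts", f"{os.name}_user"))
```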

(tf) C:\Users\Ashish Jain>python
Python 3.7.1 (default, Dec 10 2018, 22:54:23) [MSC v.1915 64 bit (AMD64)] :: Anaconda, Inc. on win32

>>> import tensorflow as tf 

Traceback (most recent call last):
  File "C:\Users\Ashish Jain\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\pywrap_tensorflow.py", line 64, in <module>
    from tensorflow.python._pywrap_tensorflow_internal import *
ImportError: DLL load failed: The specified module could not be found.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\Ashish Jain\AppData\Roaming\Python\Python37\site-packages\tensorflow\__init__.py", line 41, in <module>
    from tensorflow.python.tools import module_util as _module_util
  File "C:\Users\Ashish Jain\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\__init__.py", line 40, in <module>
    from tensorflow.python.eager import context
  File "C:\Users\Ashish Jain\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\eager\context.py", line 35, in <module>
    from tensorflow.python import pywrap_tfe
  File "C:\Users\Ashish Jain\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\pywrap_tfe.py", line 28, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "C:\Users\Ashish Jain\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\pywrap_tensorflow.py", line 83, in <module>
    raise ImportError(msg)
ImportError: Traceback (most recent call last):
  File "C:\Users\Ashish Jain\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\pywrap_tensorflow.py", line 64, in <module>
    from tensorflow.python._pywrap_tensorflow_internal import *
ImportError: DLL load failed: The specified module could not be found.


Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/errors

for some common reasons and solutions.  Include the entire stack trace
above this error message when asking for help. 
>>> 
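One common cause of this "DLL load failed" error on Windows (an assumption on my part; the original post does not diagnose it) is a missing Microsoft Visual C++ runtime, which TensorFlow's native code depends on. A quick diagnostic sketch asks the loader whether it can find the runtime DLL:

```python
import ctypes.util

# Probe for the MSVC runtime DLL that TensorFlow's native code needs on
# Windows. Returns a path/name if the loader can find it; None otherwise
# (it is always None on non-Windows systems, where this check is moot).
runtime = ctypes.util.find_library("msvcp140")
print("msvcp140 found:", runtime)
```

If this prints None on Windows, installing the Microsoft Visual C++ redistributable is the usual fix suggested on the TensorFlow install-errors page linked above.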

Saturday, August 15, 2020

The problem with perfectionists, and traits of a perfectionist



# People often brag about being perfectionists – but new research shows people much prefer colleagues with realistic expectations.

When you hear the word ‘perfectionist’, someone may spring to mind nearly instantly – a boss, colleague or even work friend whose standards have almost nothing to do with reality. They await the impossible from themselves or others, put in hours and hours making tweaks invisible to anyone but themselves, then wind up burnt out and exhausted by the end of the week. Often these people will even advertise this trait, announcing brightly: “I’m a bit of a perfectionist”. It’s a boast of sorts, and a way to differentiate themselves as a star employee.

After all, who wouldn’t want to hire someone who strives for perfection? The answer may not be a resounding ‘yes’. Increasingly, research suggests that perfectionism isn’t a professional trait you necessarily want to advertise. It can actually negatively affect the workplace environment, alienate colleagues and make it harder for teams to get along. Forthcoming research from psychologists Emily Kleszewski and Kathleen Otto, from Germany’s Philipps University of Marburg, suggests that perfectionists might be far from the ideal, or even preferred, colleague to work with. “If colleagues could choose between working with a perfectionist or a non-perfectionist,” says Kleszewski, “they would always prefer the non-perfectionist – the person with realistic expectations for themselves, and also for the team.”

# Although perfectionism can permeate every corner of a person’s life, it’s rife in professional contexts

And while perfectionism can permeate every corner of a person’s life, it’s rife in professional contexts, she says. “If you ask people in what domain they are perfectionists, the most frequent answer is always the workplace. There's a lot of performance and evaluation inherent in the tasks.”

Research has tended to focus on perfectionists’ actual output, rather than the effect it might have on team climate or interpersonal relationships. But it’s worth investigating, says Kleszewski: “We know from previous research that good team climate is important for mental wellbeing at work.”

The timing is right for the research: there’s evidence perfectionism is on the rise. A 2018 analysis from British researchers Andrew Hill and Thomas Curran investigated more than 40,000 college students’ answers to a “perfectionism scale” questionnaire, compiled between 1986 and 2015. The results were clear: young people are far more likely to be perfectionists than their predecessors. Recent college students, whether millennials or generation Z, perceive others as expecting more from them, while simultaneously having higher expectations of themselves and those around them.

Is perfectionism any good?

Before about 1910, ‘perfectionism’ was generally used to describe a niche theological viewpoint. In the past century or so, it’s come to describe a particular worldview: someone who avoids error on a personal crusade for flawlessness.

# If given the choice, colleagues would almost always choose working with a non-perfectionist.

Initially, many psychologists thought perfectionism was wholly negative and deeply neurotic. In 1950, the German psychoanalyst Karen Horney described perfectionists as being terrorised by the “tyranny of the should” – that they felt they “should” be any number of contradictory ideals, able to solve any problem, complete impossible tasks and so on. Telling a patient they expected too much of themselves tended to be fruitless, she wrote: “He will usually add, explicitly or implicitly, that it is better to expect too much of himself than too little.” In the decades since, academic opinion has become a little more conciliatory.

On the one hand, perfectionism seems to be closely correlated with mental-health difficulties, including depression, anxiety and eating disorders. Professionally speaking, it can equate to burnout and stress, as expecting the impossible may mean setting yourself up for failure. On the other hand, perfectionists have been found to be more motivated and conscientious than their non-perfectionist peers, both highly desirable traits in an employee. In a best-case scenario, perfectionists successfully channel their high standards into doing great work – while cutting themselves and others some slack when things don’t go perfectly.

# Even with all of the downsides of perfectionism, perfectionists have been found to be more motivated and conscientious than their non-perfectionist peers.

But such a balance isn’t always so easy to strike. In Kleszewski and Otto’s study, perfectionists and non-perfectionists were asked to rank potential colleagues for desirability, and to describe their experiences of getting along with others at work. Perfectionists were overwhelmingly described as highly able, but hard to get along with, while non-perfectionists topped the ratings for social skills and how much people wanted to work with them, even if they weren’t considered as competent. Perfectionists seem to notice a little coolness from their peers: the study showed that many described feeling excluded or on the edge of team dynamics.

Different approaches

These days, most researchers agree that perfectionism comes in many different forms, some of which may be more harmful than others.

# Perfectionists seem to notice a little coolness from their peers: one study showed that many described feeling excluded or on the edge of team dynamics

One well-accepted definition splits perfectionists into three groups. You might be a “self-oriented perfectionist”, who sets very high standards for just yourself; a “socially prescribed perfectionist”, who believes that the acceptance of others is dependent on your own perfection; or an “other-oriented perfectionist”, who expects flawlessness from those around them. Each type has its own strengths and weaknesses – and some are more harmful to a team dynamic than others. (Kleszewski and Otto’s study showed that perfectionists who limit their quest for excellence to their own work are far easier to get along with than those who expect a lot of those around them.)

A vast meta-analysis of 30 years of studies, conducted at the University of Pennsylvania’s Wharton Business School, explored another commonly used classification system: “excellence-seeking” and “failure-avoiding”. The first kind of perfectionist fixates on achieving excessively high standards; the second is obsessed with not making mistakes. While both groups exhibited some of the downsides of perfectionism, including workaholism, anxiety and burnout, these were especially true of the “failure-avoiding” perfectionists, who were also more likely not to be “agreeable”.

# Perfectionism can equate to burnout and stress, since expecting the impossible may mean setting yourself up for failure – at work or otherwise

Even though perfectionists may be undesirable colleagues, perhaps surprisingly, there was no relationship between perfectionism and job performance for either group, says researcher Dana Harari, who worked on the meta-analysis. “To me, the most important takeaway of this research is the null relationship between perfectionism and performance,” she says. “It's not positive, it's not negative, it's just really null.”

Your perfectionist colleague may be setting themselves up for failure – especially when it comes to getting along with others. Research suggests that by throwing all their weight at one task, they may inadvertently neglect others along the way, or miss the value of maintaining positive relationships with their co-workers. People who manage perfectionists, meanwhile, should encourage them to invest a little less in their work and a little more in their own wellbeing. And if you’ve read this with a sinking sense of guilt about your own workplace behaviour, go easy on yourself. No one’s perfect, after all.

Ref: BBC

Friday, August 14, 2020

What will it take for AI to become mainstream in business (ML Evolution)



What will it take for AI to become mainstream in business? The convergence of different research approaches—and lots of human intelligence.

We’re in the midst of a breakthrough decade for artificial intelligence (AI): More sophisticated neural networks paired with sufficient voice recognition training data brought Amazon Echo and Google Home into scores of households. Deep learning’s improved accuracy in image, voice, and other pattern recognition has made Bing Translator and Google Translate go-to services. And enhancements in image recognition have made Facebook Picture Search and the AI in Google Photos possible. Collectively, these have put machine recognition capabilities in the hands of consumers in a big way.

What will it take to make similar inroads in business? Quality training data, digital data processing, and data science expertise. It will also require a lot of human intelligence, such as language-savvy domain experts who refine computable, logically consistent business context to allow logical reasoning. Business leaders will have to take the time to teach machines and incorporate machine intelligence into more processes, starting with narrow domains.

Some in the statistically oriented machine learning research “tribes” (the Connectionists, the Bayesians, and the Analogizers, for example) will worry that the “human-in-the-loop” methods advocated by the Symbolists aren’t scalable. However, we expect these human-to-machine feedback loops, which blend the methods of several tribes, to become a lot more common inside the enterprise over the next few years.

See what that evolution might look like below. For an overview of machine learning, see the first infographic in our series. And for a better understanding of how its algorithms are used, see our machine learning methods infographic.

...

Time Series Analysis Books (Aug 2020)



Google Search String: "time series books"

1. Time series analysis and its applications
Textbook by Robert H. Shumway
Year: 2000

2. Time series analysis, forecasting and control
Book by George E. P. Box
Year: 1970

3. The analysis of time series
Book by Christopher Chatfield
Year: 1975

4. Analysis of Financial Time Series
Book by Ruey S. Tsay
Year: 2002

5. Practical Time Series Analysis: Prediction with Statistics and Machine Learning
Book by Aileen Nielsen
Year: 2019

6. Time Series Analysis
Book by James D. Hamilton
Year: 1994

7. Time Series Analysis: Univariate and Multivariate Methods
Book by William W. S. Wei
Year: 1990

8. Introduction to multiple time series analysis
Textbook by Helmut Lütkepohl
Year: 1991

9. Introductory Time Series with R
Book by Andrew Metcalfe and Paul S.P. Cowpertwait
Year: 2009

10. Forecasting and time series analysis
Book by Douglas Montgomery
Year: 1976

11. Time Series Analysis: With Applications in R
Book by Jonathan Cryer and Kung-sik Chan
Year: 2008

12. Time series
Book by Peter J. Brockwell and Richard A. Davis
Year: 1987

13. Forecasting: Principles and Practice
Book by George Athanasopoulos and Rob J. Hyndman
Year: 2013

14. Forecasting with Dynamic Regression Models
Book by Alan Pankratz
Year: 1991

15. Unit roots, cointegration, and structural change
Book by G. S. Maddala
Year: 1998

16. Forecasting, Structural Time Series Models and the Kalman Filter
Book by Andrew C. Harvey
Year: 1989

17. Nonlinear time series analysis
Book by Holger Kantz
Year: 1997

18. Applied Time Series Analysis: A Practical Guide to Modeling and Forecasting
Book by Terence C. Mills
Year: 2019

19. Multivariate Time Series Analysis and Applications
Book by William W. S. Wei
Year: 2018

20. Time Series: A Data Analysis Approach Using R
Book by David S. Stoffer and Robert Shumway
Year: 2019

21. Time-series forecasting
Book by Christopher Chatfield
Year: 2000

22. Introduction to time series and forecasting
Book by Peter J. Brockwell and Richard A. Davis
Year: 1996

23. Hands-On Time Series Analysis with R: Perform Time Series Analysis and Forecasting Using R
Book by Rami Krispin
Year: 2019

24. Introduction to Time Series Using Stata
Book by Sean Becketti
Year: 2013

25. Introductory Econometrics for Finance
Textbook by Chris Brooks
Year: 2002

26. Time Series Econometrics: Learning Through Replication
Book by John D. Levendis
Year: 2019

27. Time Series Analysis
Book by Henrik Madsen
Year: 2007

28. Practical Time Series Analysis: Master Time Series Data Processing, Visualization, and Modeling Using Python
Book by Avishek Pal and PKS Prakash
Year: 2017

29. Multivariate Time Series Analysis: With R and Financial Applications
Book by Ruey S. Tsay
Year: 2013

30. Introduction to Time Series Forecasting With Python: How to Prepare Data and Develop Models to Predict the Future
Book by Jason Brownlee
Year: 2017

31. Introduction to Time Series Analysis
Book by Mark Pickup
Year: 2014