def compare_dict(x, y):
    # Items whose key is in both dicts and whose values match
    shared_items = {k: x[k] for k in x if k in y and x[k] == y[k]}
    # Items whose key is in both dicts but whose values differ (values taken from x)
    differing_values = {k: x[k] for k in x if k in y and x[k] != y[k]}
    # Keys that are present only in x
    differing_keys = [k for k in x if k not in y]
    return {
        "shared_items_in_x": shared_items,
        "differing_values_in_x": differing_values,
        "differing_keys_in_x": differing_keys
    }


def is_dict_in_list(d, l):
    # True if some dict in l contains every key of d with the same value
    rtn = False
    for k in l:
        cd = compare_dict(d, k)
        if len(cd['differing_values_in_x']) == 0 and len(cd['differing_keys_in_x']) == 0:
            rtn = True
            break
    return rtn


def purify_list_of_dicts(inlist):
    # Flatten a list of lists of dicts, then keep the first occurrence of each dict
    nlist = [i for j in inlist for i in j]
    olist = []
    for i in range(0, len(nlist)):
        if not is_dict_in_list(nlist[i], olist):
            olist.append(nlist[i])
    return olist
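A short usage sketch for the helpers above. The functions are re-declared in compact form so the snippet runs on its own, and the sample dictionaries are made up purely for illustration.

```python
def compare_dict(x, y):
    """Describe how dict x relates to dict y, from x's point of view."""
    return {
        "shared_items_in_x": {k: x[k] for k in x if k in y and x[k] == y[k]},
        "differing_values_in_x": {k: x[k] for k in x if k in y and x[k] != y[k]},
        "differing_keys_in_x": [k for k in x if k not in y],
    }


def is_dict_in_list(d, dicts):
    """True if some entry holds every key of d with an equal value."""
    for other in dicts:
        cd = compare_dict(d, other)
        if not cd["differing_values_in_x"] and not cd["differing_keys_in_x"]:
            return True
    return False


def purify_list_of_dicts(inlist):
    """Flatten a list of lists of dicts and drop repeated dicts."""
    unique = []
    for d in (item for sub in inlist for item in sub):
        if not is_dict_in_list(d, unique):
            unique.append(d)
    return unique


# Hypothetical sample data, chosen only for illustration.
a = {"id": 1, "name": "alice", "city": "Delhi"}
b = {"id": 1, "name": "bob", "country": "IN"}
report = compare_dict(a, b)
print(report)

batches = [[{"id": 1}, {"id": 2}], [{"id": 1}, {"id": 3}]]
unique = purify_list_of_dicts(batches)
print(unique)
```

Note that the membership check is a subset match: a dict counts as already present when an earlier entry contains all of its keys with equal values, even if that earlier entry has extra keys. If exact equality is wanted instead, comparing with `==` would be the simpler choice.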
Tuesday, August 23, 2022
Compare two dictionaries in Python
Thursday, August 11, 2022
Using Sentiment to Detect Bots on Twitter: Are Humans more Opinionated than Bots (Dickerson, Jul 2022)
Tags: Natural Language Processing
Abstract
In many Twitter applications, developers collect only a limited sample of tweets and a local portion of the Twitter network. Given such Twitter applications with limited data, how can we classify Twitter users as either bots or humans? We develop a collection of network-, linguistic-, and application-oriented variables that could be used as possible features, and identify specific features that distinguish well between humans and bots. In particular, by analyzing a large dataset relating to the 2014 Indian election, we show that a number of sentiment-related factors are key to the identification of bots, significantly increasing the Area under the ROC Curve (AUROC). The same method may be used for other applications as well.
A. Previous Work
There has been recent interest in the detection of malicious and/or fake users from both the online social networks and computer networking communities.
# For instance, Wang [4] looks at graph-based features to identify bots on Twitter, while Yang, Harkreader, and Gu [5] combine similar graph-based features with syntactic metrics to build their classifiers.
[4] A. H. Wang, “Detecting spam bots in online social networking sites: A machine learning approach,” in Conference on Data and Applications Security and Privacy. ACM, 2010, pp. 335–342.
[5] C. Yang, R. C. Harkreader, and G. Gu, “Die free or live hard? Empirical evaluation and new design for fighting evolving Twitter spammers,” in Recent Advances in Intrusion Detection. Springer, 2011, pp. 318–337.
# Thomas et al. [6] use a similar set of features to provide a retrospective analysis of a large set of recently-suspended Twitter accounts.
[6] K. Thomas, C. Grier, D. Song, and V. Paxson, “Suspended accounts in retrospect: An analysis of Twitter spam,” in Internet Measurement Conference (IMC). ACM, 2011, pp. 243–258.
# Boshmaf et al. [7] instead create bots (rather than detecting them), claiming that 80% of bots are undetectable and that Facebook’s Immune system [8] was unable to detect their bots.
[7] Y. Boshmaf, I. Muslukhov, K. Beznosov, and M. Ripeanu, “The socialbot network: When bots socialize for fame and money,” in Annual Computer Security Applications Conference (ACSAC). ACM, 2011, pp. 93–102.
[8] T. Stein, E. Chen, and K. Mangla, “Facebook immune system,” in Workshop on Social Network Systems (SNS). ACM, 2011.
# Lee, Caverlee, and Webb [9] create “honeypot” accounts to lure both humans and spammers into the open, then provide a statistical analysis of the malicious accounts they identified.
[9] K. Lee, J. Caverlee, and S. Webb, “Uncovering social spammers: Social honeypots + machine learning,” in Annual ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 2010, pp. 435–442.
# In computer networks research, the detection of Sybil accounts in computer networks has been applied to social network data; these techniques tend to rely on the “fast mixing” property of a network (which may not exist in social networks [10]) and do not scale to the size of present-day social networks (e.g., SybilInfer [3] runs in time O(|V|^2 log |V|), which is intractable for networks with millions of users).
[10] A. Mohaisen, A. Yun, and Y. Kim, “Measuring the mixing time of social graphs,” in Internet Measurement Conference (IMC). ACM, 2010, pp. 383–389.
V. CONCLUSION
In many real-world applications, developers are only able to collect tweets from the Twitter API that directly address a set of topics of interest (TOI) relevant to the application. Moreover, in such applications, developers also typically only collect a local portion of the Twitter network. As a consequence, many traditional primarily network-based methods for detecting bots are less or not effective (e.g., if the topics are quite specific, not discussed by very popular people, or not retweeted much), since a sparse subset of the global network and tweet database based on a set TOI is insufficient. The SentiBot framework presented in this paper addresses the classification of users as human versus bot in such applications. In order to achieve this, SentiBot relies on four classes of variables (or features) related to tweet syntax, tweet semantics, user behavior, and network-centric user properties. In particular, we introduce a large set of sentiment variables, including combinations of sentiment and network variables; to our knowledge, this is the first time such sentiment-based features have been used in bot detection. In addition, we introduce variables related to topics of interest. We apply a suite of classical machine learning algorithms to identify: (i) users who are bots and (ii) TOI-independent features that are particularly important in distinguishing between bots and humans. Based on an analysis of over 7.7 million tweets and 550,000 users associated with the recently concluded 2014 Indian election (where there were reports of social media campaigns), we were able to show that the use of sentiment variables significantly improved the accuracy of our classification. In particular, the Area under the ROC Curve (AUROC) increased from 0.65 to 0.73. As an AUROC of 0.5 represents random guessing, this reflects a 53% improvement in accuracy.
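The text does not spell out how the 53% figure is derived. One reading that reproduces it, offered here only as an assumption, is to measure each AUROC as its margin above the random-guessing baseline of 0.5 and compare the two margins. The function name below is hypothetical.

```python
def improvement_over_chance(before, after, chance=0.5):
    """Relative gain in AUROC, measured as the margin above random guessing."""
    return (after - chance) / (before - chance) - 1.0


# AUROC values quoted in the conclusion: 0.65 without and 0.73 with sentiment features.
gain = improvement_over_chance(0.65, 0.73)
print(f"{gain:.0%}")
```

Under this reading the margin grows from 0.15 to 0.23, a ratio of roughly 1.53. A plain ratio of 0.73 to 0.65 would give only about 12%, which is why the baseline-adjusted interpretation seems the likelier one.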
In addition, we discovered that (in our dataset): 1) Bots flip-flop much less frequently than humans in terms of sentiment; 2) When humans express positive sentiment, they tend to express stronger positive sentiment than bots; 3) A similar (but slightly more nuanced) trend holds in terms of expression of negative sentiments by humans; and 4) Humans disagree more with the general sentiment of the application's Twitter population than bots. Our results can feed into many applications. For instance, when assessing which Twitter users are influential on a given topic, we must discount for bots—which requires methods like those presented in this paper to identify bots. When identifying the expected spread of a sentiment through Twitter, we again must discount for bots. The paper presents a general framework within which applications can identify bots using the relatively limited local data they have.
Fluoxetine (SSRI (Selective Serotonin Reuptake Inhibitor))
Fluoxetine, sold under the brand names Prozac and Sarafem, among others, is an antidepressant of the selective serotonin reuptake inhibitor class. It is used for the treatment of major depressive disorder, obsessive–compulsive disorder, bulimia nervosa, panic disorder, and premenstrual dysphoric disorder.
Tags: Medicine
Fluoxetine Uses
Fluoxetine is used in the treatment of depression, panic disorder and obsessive-compulsive disorder.
How Fluoxetine works
Fluoxetine is a selective serotonin reuptake inhibitor (SSRI) antidepressant. It works by increasing the levels of serotonin, a chemical messenger in the brain. This improves mood and physical symptoms of depression and also relieves symptoms of panic and obsessive disorders.
Common side effects of Fluoxetine
Weakness, Insomnia (difficulty in sleeping), Nervousness, Anxiety, Blurred vision, Decreased libido, Fatigue, Frequent urge to urinate, Gastrointestinal disturbance, Headache, Palpitations, Prolonged QT interval
Composition
Fluoxetine: 20 mg Capsule
Wednesday, August 10, 2022
Classification of Twitter Accounts into Automated Agents and Human Users (Zafar Gilani, Jul 2022)
Tags: Natural Language Processing
Abstract
Online social networks (OSNs) have seen a remarkable rise in the presence of surreptitious automated accounts. Massive human user-base and business-supportive operating model of social networks (such as Twitter) facilitates the creation of automated agents. In this paper we outline a systematic methodology and train a classifier to categorise Twitter accounts into ‘automated’ and ‘human’ users. To improve classification accuracy we employ a set of novel steps. First, we divide the dataset into four popularity bands to compensate for differences in types of accounts. Second, we create a large ground truth dataset using human annotations and extract relevant features from raw tweets. To judge accuracy of the procedure we calculate agreement among human annotators as well as with a bot detection research tool. We then apply a Random Forests classifier that achieves an accuracy close to human agreement. Finally, as a concluding step we perform tests to measure the efficacy of our results.
Index Terms
Social network analysis; account classification; automated agents; bot detection
Our work has the following contributions:
(i) Use of raw historical data (60 million tweets) for attribute collection and account classification (722,109 tweets) to cater for stealthier agents that are harder to discern from humans;
(ii) A Twitter dataset divided into user popularity bands, further partitioned into lists of agents and humans (for reasons refer to §IV) using a human annotation task. This serves as a large ground truth dataset;
(iii) 14 novel features from a total feature-set of 21 attributes (see §IV);
(iv) Performance evaluation of current state of the art in bot detection by calculating agreement between human annotators and BOTORNOT;
(v) Application of supervised learning approach – Random Forests classifier – for non-partisan account categorisation;
(vi) Identification of a distinct group of features (using ablation tests) that are most informative for classifying automated agents within each popularity band (cf. Table VIII); and
(vii) Hypotheses (cf. Table I) verification against our findings using t-tests (see §VI).
Infotainment
References
12: Datasets can be found here: https://goo.gl/SigsQB. Classifier is available as a part of Stweeler. The link is not accessible to the public.
Monday, August 8, 2022
Accessing Twitter API From Two Systems. One With Firewall and Second Without Firewall
This note is less about accessing the Twitter API and more about cyber security: you run a curl command and, based on the output from that command, try to figure out the firewall settings of the system.
System 1 Configuration With Strict Firewall Where Our Curl Command For Accessing Twitter API is Not Working:
(base) C:\Users\ash\Desktop>systeminfo
OS Name: Microsoft Windows 10 Enterprise
OS Version: 10.0.19042 N/A Build 19042
Processor(s): 1 Processor(s) Installed.
    [01]: AMD64 Family 23 Model 24 Stepping 1 AuthenticAMD ~2100 Mhz
BIOS Version: HP R79 Ver. 01.10.03, 3/24/2020
Network Card(s): 4 NIC(s) Installed.
    [01]: Realtek RTL8822BE 802.11ac PCIe Adapter
        Connection Name: Wi-Fi
        DHCP Enabled: Yes
        DHCP Server: 192.168.1.1
        IP address(es)
        [01]: 192.168.1.100
        [02]: fe80::b1b2:6d59:f669:1b96
        [03]: 2401:4900:47f1:b174:70f4:de28:6287:b1c9
        [04]: 2401:4900:47f1:b174:b1b2:6d59:f669:1b96
    [02]: Realtek PCIe GbE Family Controller
        Connection Name: Ethernet
        Status: Media disconnected
    [03]: Bluetooth Device (Personal Area Network)
        Connection Name: Bluetooth Network Connection
        Status: Media disconnected
    [04]: Check Point Virtual Network Adapter For Endpoint VPN Client
        Connection Name: Ethernet 2
        DHCP Enabled: Yes
        DHCP Server: 10.79.251.145
        IP address(es)
        [01]: 10.79.251.146
        [02]: fe80::3df2:2a4:b2e1:cb0
Hyper-V Requirements: VM Monitor Mode Extensions: Yes
    Virtualization Enabled In Firmware: Yes
    Second Level Address Translation: Yes
    Data Execution Prevention Available: Yes
System 2 Without Strict Firewall Where Curl Command is Working:
C:\Users\Ashish Jain>systeminfo
Host Name: LAPTOP-79RV456R
OS Name: Microsoft Windows 10 Home Single Language
OS Version: 10.0.19043 N/A Build 19043
OS Manufacturer: Microsoft Corporation
OS Configuration: Standalone Workstation
OS Build Type: Multiprocessor Free
Registered Owner: Ashish Jain
Registered Organization:
Product ID: 00327-35105-52167-AAOEM
Original Install Date: 3/14/2021, 6:33:25 AM
System Boot Time: 7/14/2022, 5:34:13 PM
System Manufacturer: LENOVO
System Model: 81H7
System Type: x64-based PC
Processor(s): 1 Processor(s) Installed.
    [01]: Intel64 Family 6 Model 78 Stepping 3 GenuineIntel ~2000 Mhz
BIOS Version: LENOVO 8QCN26WW(V1.14), 12/29/2020
Windows Directory: C:\WINDOWS
System Directory: C:\WINDOWS\system32
Boot Device: \Device\HarddiskVolume1
System Locale: en-us;English (United States)
Input Locale: 00004009
Time Zone: (UTC+05:30) Chennai, Kolkata, Mumbai, New Delhi
Total Physical Memory: 12,154 MB
Available Physical Memory: 7,634 MB
Virtual Memory: Max Size: 14,010 MB
Virtual Memory: Available: 8,057 MB
Virtual Memory: In Use: 5,953 MB
Page File Location(s): C:\pagefile.sys
Domain: WORKGROUP
Logon Server: \\LAPTOP-79RV456R
Hotfix(s): 15 Hotfix(s) Installed.
    [01]: KB5013887
    [02]: KB4562830
    [03]: KB4577586
    [04]: KB4580325
    [05]: KB4589212
    [06]: KB5000736
    [07]: KB5015807
    [08]: KB5006753
    [09]: KB5007273
    [10]: KB5011352
    [11]: KB5011651
    [12]: KB5014032
    [13]: KB5014035
    [14]: KB5014671
    [15]: KB5005699
Network Card(s): 4 NIC(s) Installed.
    [01]: VirtualBox Host-Only Ethernet Adapter
        Connection Name: VirtualBox Host-Only Network
        DHCP Enabled: No
        IP address(es)
        [01]: 192.168.56.1
        [02]: fe80::f839:dc84:9a7b:3087
    [02]: Realtek 8821CE Wireless LAN 802.11ac PCI-E NIC
        Connection Name: Wi-Fi
        Status: Media disconnected
    [03]: Realtek PCIe FE Family Controller
        Connection Name: Ethernet
        Status: Media disconnected
    [04]: Bluetooth Device (Personal Area Network)
        Connection Name: Bluetooth Network Connection
        Status: Media disconnected
Hyper-V Requirements: VM Monitor Mode Extensions: Yes
    Virtualization Enabled In Firmware: Yes
    Second Level Address Translation: Yes
    Data Execution Prevention Available: Yes
C:\Users\Ashish Jain>
I was able to make a successful request from System 2:
(base) C:\Users\Ashish Jain>curl "https://api.twitter.com/2/users/by/username/vantagepoint21" -H "Authorization: Bearer A***V"
{"data":{"id":"96529689","name":"Ashish Jain","username":"vantagepoint21"}}
(base) C:\Users\Ashish Jain>curl "https://api.twitter.com/2/users/by/username/elonmusk" -H "Authorization: Bearer A***V"
{"data":{"id":"44196397","name":"Elon Musk","username":"elonmusk"}}
The curl command does not work on System 1. I think the network firewall settings on my office laptop are causing the issue, which is why I could not get a response from the Twitter API.
(base) C:\Users\ash\Desktop\twitter_api>curl "https://api.twitter.com/2/users/by/username/vantagepoint21" -H "Authorization: Bearer 9***2"
curl: (35) schannel: next InitializeSecurityContext failed: Unknown error (0x80092012) - The revocation function was unable to check revocation for the certificate.
On further testing the "curl" command on 'System 1' for URLs with "http" and "https" protocols:
(base) C:\Users\ash\Desktop>curl www.survival8.blogspot.com
<HTML>
<HEAD>
<TITLE>Moved Permanently</TITLE>
</HEAD>
<BODY BGCOLOR="#FFFFFF" TEXT="#000000">
<H1>Moved Permanently</H1>
The document has moved <A HREF="http://survival8.blogspot.com/">here</A>.
</BODY>
</HTML>
Success for HTTP based URL
---
(base) C:\Users\ash\Desktop>curl https://survival8.blogspot.com
curl: (35) schannel: next InitializeSecurityContext failed: Unknown error (0x80092012) - The revocation function was unable to check revocation for the certificate.
(base) C:\Users\ash\Desktop>curl https://survival8.blogspot.com/2022/08/lets-talk-about-whataboutery.html
curl: (35) schannel: next InitializeSecurityContext failed: Unknown error (0x80092012) - The revocation function was unable to check revocation for the certificate.
Failure for HTTPS based URL.
---
Successful Testing With Another HTTP based URL:
(base) C:\Users\ash\Desktop>curl http://survival8.blogspot.com/2022/08/lets-talk-about-whataboutery.html
<!DOCTYPE html>
<html class='v2' dir='ltr' lang='en'>
<head>
<link href='https://www.blogger.com/static/v1/widgets/2975350028-css_bundle_v2.css' rel='stylesheet' type='text/css'/>
<meta content='width=1100' name='viewport'/>
<meta content='text/html; charset=UTF-8' http-equiv='Content-Type'/>
<meta content='blogger' name='generator'/>
<link href='http://survival8.blogspot.com/favicon.ico' rel='icon' type='image/x-icon'/>
<link href='http://survival8.blogspot.com/2022/08/lets-talk-about-whataboutery.html' rel='canonical'/>
<link rel="alternate" type="application/atom+xml" title="survival8 - Atom" href="http://survival8.blogspot.com/feeds/posts/default" />
<link rel="alternate" type="application/rss+xml" title="survival8 - RSS" href="http://survival8.blogspot.com/feeds/posts/default?alt=rss" />
<link rel="service.post" type="application/atom+xml" title="survival8 - Atom" href="https://draft.blogger.com/feeds/7823701911930369175/posts/default" />
<link rel="alternate" type="application/atom+xml" title="survival8 - Atom" href="http://survival8.blogspot.com/feeds/1169952638388485943/comments/default" />
<!--Can't find substitution for tag [blog.ieCssRetrofitLinks]-->
<meta content='http://survival8.blogspot.com/2022/08/lets-talk-about-whataboutery.html' property='og:url'/>
<meta content='Let’s talk about ‘Whataboutery’' property='og:title'/>
<meta content=' what·about·ery [ˌwɒtəˈbaʊtəri] NOUN BRITISH the technique or practice of responding to an accusation or dif...' property='og:description'/>
<title>survival8: Let’s talk about ‘Whataboutery’</title>
<style id='page-skin-1' type='text/css'><!--
/* ----------------------------------------------- Blogger Template Style Name: Simple Designer: Blogger URL: www.blogger.com ----------------------------------------------- */
/* Content ----------------------------------------------- */
body { ...
Also, note that if this were an authorization failure from the Twitter API, the output would still be an informative message in JSON format:
(base) C:\Users\Ashish Jain>curl "https://api.twitter.com/2/users/by/username/elonmusk" -H "Authorization: Bearer 9***INCORRECT_BEARER_TOKEN***2"
{
  "title": "Unauthorized",
  "type": "about:blank",
  "status": 401,
  "detail": "Unauthorized"
}
On a Side Note: Take a look at another error message from Twitter API:
(base) C:\Users\Ashish Jain>curl "https://api.twitter.com/2/users/by/username/elonmusj" -H "Authorization: Bearer A***V"
{
  "errors": [
    {
      "parameter":"username",
      "resource_id":"elonmusj",
      "value":"elonmusj",
      "detail":"User has been suspended: [elonmusj].",
      "title":"Forbidden",
      "resource_type":"user",
      "type":"https://api.twitter.com/2/problems/resource-not-found"
    }
  ]
}
Notice the typo in Elon Musk's user handle we provided in query: elonmusj
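The reasoning in this note can be summarised as a small decision rule: a curl error (35) means the TLS handshake failed and the request never reached the API, whereas a JSON body means the API itself answered, whether with data, a 401, or an error list. The sketch below encodes that rule in Python; the function name and the labels it returns are hypothetical, and it only inspects an exit code and body that you pass in, without making any network call.

```python
import json


def classify_curl_result(exit_code, body):
    """Guess which layer failed from curl's exit code and the response body."""
    if exit_code == 35:
        # curl error 35 is an SSL/TLS connect error: nothing reached the server.
        return "tls_handshake_failed"
    if exit_code != 0:
        return "transport_error"
    try:
        payload = json.loads(body)
    except (ValueError, TypeError):
        # For example an HTML redirect page instead of an API response.
        return "non_json_response"
    if not isinstance(payload, dict):
        return "non_json_response"
    if payload.get("status") == 401:
        return "api_rejected_token"
    if "errors" in payload:
        return "api_reported_error"
    if "data" in payload:
        return "ok"
    return "unknown_json"


# Bodies copied from the outputs shown in this note.
ok_body = '{"data":{"id":"44196397","name":"Elon Musk","username":"elonmusk"}}'
unauthorized_body = '{ "title": "Unauthorized", "type": "about:blank", "status": 401, "detail": "Unauthorized" }'

print(classify_curl_result(35, ""))
print(classify_curl_result(0, ok_body))
print(classify_curl_result(0, unauthorized_body))
```

The useful distinction is between the first label and the rest: only a transport-level failure points at the local network or firewall, while any JSON answer shows that the connection to the API worked.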
Sunday, August 7, 2022
Diclogem Tablet (Diclofenac (50mg) + Paracetamol (325mg))
Diclogem Tablet
Prescription Required
Manufacturer: Omega Pharmaceuticals Pvt Ltd
SALT COMPOSITION: Diclofenac (50mg) + Paracetamol (325mg)
Storage: Store below 30°C
Tags: Medicine
Product introduction
Diclogem Tablet is a pain-relieving medicine. It is used to reduce pain and inflammation in conditions like rheumatoid arthritis, ankylosing spondylitis, and osteoarthritis. It may also be used to relieve muscle pain, back pain, toothache, or pain in the ear and throat.
Diclogem Tablet should be taken with food. This will prevent you from getting an upset stomach. You should take it regularly as advised by your doctor. Do not take more or use it for a longer duration than recommended by your doctor.
Some of the common side effects of this medicine include nausea, vomiting, stomach pain, loss of appetite, heartburn, and diarrhea. If any of these side effects bother you or do not go away with time, you should let your doctor know. Your doctor may help you with ways to reduce or prevent the side effects.
The medicine may not be suitable for everybody. Before taking it, let your doctor know if you have any problems with your heart, kidneys, liver, or have stomach ulcers. To make sure it is safe for you, let your doctor know about all the other medicines you are taking. Pregnant and breastfeeding mothers should first consult their doctors before using this medicine.
Uses of Diclogem Tablet
Pain relief
Benefits of Diclogem Tablet
In Pain relief
Diclogem Tablet is a combination of medicines that is used for short-term relief of pain, inflammation and swelling. It inhibits release of those chemical messengers in the brain that tell us that we have pain. It effectively relieves back pain, earache, throat pain, toothache and pain due to arthritis too. Take it as it is prescribed to get the most benefit. Do not take more or for longer than needed as that can be dangerous. In general, you should take the lowest dose that works, for the shortest possible time. This will help you to go about your daily activities more easily and have a better, more active, quality of life.
Side effects of Diclogem Tablet
Most side effects do not require any medical attention and disappear as your body adjusts to the medicine. Consult your doctor if they persist or if you’re worried about them:
Common side effects of Diclogem
Nausea
Vomiting
Stomach pain/epigastric pain
Heartburn
Diarrhea
Loss of appetite
Fact Box
Habit Forming: No
Therapeutic Class: PAIN ANALGESICS
Saturday, August 6, 2022
Calcitas - D3 Soft Gelatin Capsule
Calcitas - D3 Soft Gelatin Capsule
Manufacturer: Intas Pharmaceuticals Ltd
Tags: Medicine
Information about Calcitas - D3 Soft Gelatin Capsule
Calcitas D3 Capsule contains Cholecalciferol 60,000 IU (international units). Cholecalciferol (Vitamin D3) is a fat-soluble vitamin that helps the body to absorb calcium and phosphorus found in food and supplements. Vitamin D is made by the body when skin is exposed to sunlight. Sunscreen, protective clothing, limited exposure to sunlight, dark skin, and age may prevent getting enough vitamin D from the sun, thus leading to Vitamin D3 deficiency. Thus, Vitamin D3 in Calcitas D3 Capsule is essential for calcium absorption in the body.
---
Cholecalciferol is a dietary supplement that is used to treat vitamin D deficiency. It is also used with calcium to maintain bone strength. This medicine is available both over-the-counter (OTC) and with your doctor's prescription.
---
Cholecalciferol, also known as vitamin D₃ and colecalciferol, is a type of vitamin D that is made by the skin when exposed to sunlight; it is found in some foods and can be taken as a dietary supplement. Cholecalciferol is made in the skin following UVB light exposure.
---
Other uses of Calcitas D3 Capsule are:
Building and keeping the bones and teeth strong
Reducing fatigue/stress and muscular pains
Boosting immunity and increasing resistance against infection
Supplement for patients with diabetic complications and cardiovascular diseases as well. Use under medical supervision.
Tuesday, August 2, 2022
Chatbot Examples in Use in Different Business Domains
Tags: Natural Language Processing
The Apollo 11 Mission
Apollo 11 (July 16 - 24, 1969) was the American spaceflight that first landed humans on the Moon. Commander Neil Armstrong and lunar module pilot Buzz Aldrin landed the Apollo Lunar Module Eagle on July 20, 1969, at 20:17 UTC, and Armstrong became the first person to step onto the Moon's surface six hours and 39 minutes later, on July 21 at 02:56 UTC. Aldrin joined him 19 minutes later, and they spent about two and a quarter hours together exploring the site they had named Tranquility Base upon landing. Armstrong and Aldrin collected 47.5 pounds (21.5 kg) of lunar material to bring back to Earth as pilot Michael Collins flew the Command Module Columbia in lunar orbit, and were on the Moon's surface for 21 hours, 36 minutes before lifting off to rejoin Columbia.
The Apollo 11 mission had an associated lunar system that let geologists ask questions in natural language. A geologist would ask a question like "what is the average basalt content" and the system would respond.
Chatbots in Healthcare
Chatbots like Molly, Eva, Ginger, Replika, Florence, and Izzy are widely used in healthcare.
Chatbots for mental health support
Bots like Wysa and Woebot are designed to provide support like a life coach. They are good at asking the right probing questions, which can help users share their emotions and feelings after a hard day.
Chatbots for legal advice
Lawyers can use bots like DoNotPay, LISA, Ross, and BillyBot to accelerate their work and provide better client experiences.
Other Chatbot applications
In smart keyboards like SwiftKey, the software automatically completes your sentences by predicting the next word and corrects your spelling mistakes. Applications like Grammarly can automatically correct your spelling and grammar and assist you in writing better essays or emails.
Dated: 2022-Aug-02
Thursday, July 28, 2022
Natural Language Processing Questions and Answers (Set 4 of 7 Questions)
Course: INTRODUCTION TO NATURAL LANGUAGE PROCESSING
Q1: Multiple Choice Correct
Which of the following are potential use cases of NLP?
a) A self-driving car drawing your attention to an advertising billboard
b) Given the audio of a song and its lyrics, generate a translated song audio
c) Understanding a cryptic language
d) Determining the chances that you will win a lawsuit based on outcomes of previous similar lawsuits.
Answer: All four are correct.
Q2: Multiple Choice Correct
Which of the below tasks can be performed effectively even without using sophisticated NLP techniques:
a) Identifying the main topic of a document assuming that its title is not provided.
b) Detecting the language in a document
c) Extracting the phone number, email address and year of graduation from a resume.
d) Substituting words like doesn't, can't, etc. with does not, can not, etc.
Answer: C and D
Q3: Spam email is a persistent problem that service providers have been trying to solve for years now. One of the key tasks in building an effective spam detection system is identifying the features of an email that could be used to classify the email as spam or not. Rank the following features, which are based on the text content of an email, according to your understanding of each feature's importance.
a) Language (English, French, etc.) used in the email text.
b) Presence of words with spelling mistakes / non-standard form.
c) Emails addressed to you and containing your name.
Answer: Correct order is: C > A > B
Q4) Identify the kind of ambiguity in the given sentences:
a) Time flies like an arrow, fruit flies like a banana.
b) Iraqi head seeks arms.
c) A frog thought it saw a prince walk towards it. It thought it can't be true.
List of ambiguities for matching with the above sentences:
I) Anaphoric Ambiguity
II) Semantic Ambiguity
III) Syntactic Ambiguity
Answer: A -> III, B -> II, C -> I
Syntactic ambiguity
Take a look at the sentence given below:
“Old men and women were taken to safe locations”
This sentence has a syntactic ambiguity where the scope of the adjective “old” needs to be resolved. In this sentence, we may not know if the adjective applies only to men or to both men and women.
Semantic ambiguity
Semantic ambiguity refers to ambiguity in the meaning. For example, the sentence “Alice loves her mother and so does Jacob.” The ambiguity here is, we may not know if Jacob loves his own mother or Alice’s mother.
Anaphoric Ambiguity
In the below paragraph:
“The horse ran up the hill. It was very steep. It soon got tired.”
In this paragraph, the pronoun ‘it’ is used to refer to the hill first and then to the horse. To interpret this sentence, we need to have knowledge of the world and context. These ambiguities are called anaphoric ambiguities.
Q5) Consider the below review for co-sleeper sheets for a baby. What is the sentiment in this review?
"The shipping was quick the colors are pretty but the sheets themselves are not soft."
a) Positive
b) Negative
c) Neutral
Answer: Positive. The user is appreciating the shipping and the colors.
Q6) Do sentiment analysis of the following sentence:
"The parking was great, the restaurant ambience was good. But the food was utterly terrible."
a) Positive
b) Negative
c) Neutral
Answer: Although the number of positive words is greater than the number of negative words in these sentences, the overall sentiment was negative.
Weighted Scores to Find The Polarity
The shortcoming of the dictionary-based, weighted-score approach to sentiment analysis is that it ignores the order of words and hence may classify the sentiment incorrectly.
Q7) Assume that you have to build an NLP application that looks at a new document and estimates how similar it is to various text documents previously ingested. Consider that similarity of 2 documents is computed on the basis of presence of common words. Based on your understanding of the NLP techniques discussed so far, what are the various basic pre-processing steps that you will include in this application while processing the historic data and making inferences on a new document?
Steps:
a. Remove any unwanted spaces, numbers, special characters, etc.
b. Convert all text into lower case.
c. Create n-grams based on the text.
d. Tokenize the text.
e. Normalize data using stemming and lemmatization techniques.
f. Determine the frequency of each word in each document and also in the whole corpus.
g. Remove stop words from the text.
h. Remove punctuation.
i. Perform POS tagging on the text.
Options:
I. All the steps listed above need to be done.
II. a, b, d, f, g, h
III. b, c, d, e, h, g
IV. a, d, e, f, g
Answer: II
Tags: Natural Language Processing
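The steps chosen in Q7 (a, b, d, f, g, h) can be sketched with the Python standard library alone. The stop-word list below is a tiny illustrative one, the function names are hypothetical, and the similarity score uses Jaccard overlap of the remaining words, which is only one possible way to measure "presence of common words" and is an assumption here.

```python
import re
from collections import Counter

# A deliberately tiny stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "an", "is", "was", "and", "of", "to", "in", "on"}


def preprocess(text):
    """Apply steps b, a/h, d and g from Q7 and return the remaining tokens."""
    text = text.lower()                      # step b: lower case
    text = re.sub(r"[^a-z\s]", " ", text)    # steps a and h: drop digits, punctuation, special characters
    tokens = text.split()                    # step d: tokenize (also collapses extra spaces)
    return [t for t in tokens if t not in STOP_WORDS]  # step g: remove stop words


def corpus_frequencies(docs):
    """Step f: word frequencies summed over all documents."""
    total = Counter()
    for doc in docs:
        total.update(preprocess(doc))
    return total


def common_word_similarity(doc_a, doc_b):
    """Jaccard overlap of the two documents' word sets (an assumed measure)."""
    a, b = set(preprocess(doc_a)), set(preprocess(doc_b))
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


old_doc = "The parking was great!"
new_doc = "Parking is great, food was terrible."
score = common_word_similarity(old_doc, new_doc)
freqs = corpus_frequencies([old_doc, new_doc])
print(score, freqs.most_common(2))
```

The regular expression keeps only the letters a to z, so this sketch is suited to plain English text; accented or non-Latin characters would need a different pattern.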
20220728 - Monitoring Effects of 1 tablet of Trini Calm and 1 tablet of Petril Beta 10
Index of Journals
20220728
1910: 1 Tablet of Trinicalm Plus. SALT COMPOSITION: Trifluoperazine (5mg) + Trihexyphenidyl (2mg)
1 Tablet of Petril Beta 10 Tablet. SALT COMPOSITION: Clonazepam (0.25mg) + Propranolol (10mg)
Note:
1. Trihexyphenidyl is also referred to as "THP" in medical prescriptions for psychiatric cases.
2. Clonazepam is also known as Clazzy in the underworld of drugs.
1914: Shiva Patel has just come for Math tuition.
1918: My psychiatrist told me that Propranolol is used to slow down a racing heartbeat, which is an effect of facing a threatening situation.
2015: Finished teaching students.
2016: Having dinner.
2024: Going for shower.
2037: Am feeling sleepy and tired. Going for rest for an hour.
2040: Spoke to Anjali Devi's parents about NIOS (National Institute of Open Schooling) and readmitting her to study again.
2021: Going for rest.
8:52 pm: I cannot stop thinking how Rekha bua, Manju bua, and Kumkum bua are becoming a blocker in rental business.
8:54 pm: They do not understand that I purchased the flat after having a verbal fight with mom. Mom and I cannot live together.
9:32 pm: Self awareness was there but that panicky, irritated mood was not there.
2202: When I am in Mayur Vihar, I face harassment by uncle and aunt. And, when I am in Tri Nagar, I face harassment by three buas.
Tags: Medicine, Psychology
Student Update (2022-Jul-28)
Index of Journals
Tags: Student Update
Counting
Srishti Patel Class: Nursery Till: 8
Anjali Devi Class: 5 Till: 9
Tables
Sonam Patel Class: 7 Till: 12
Shiva Patel Class: 6C Till: 18
Addition
Sonam Patel Class: 7 Till Level: 4
Shiva Patel Class: 6C Till Level: 9
Subtraction
Sonam Patel Class: 7 Till Level: 8
Types of Ambiguities in Natural Language
Tags: Natural Language Processing
Lexical ambiguity
Take a look at the following sentences:
John bagged two silver medals.
Mary made a silver speech.
Roger’s worries had silvered his hair.
The word silver is used as a noun, an adjective, and a verb. The word silver in isolation is mostly associated with the metal and considered as a noun. However, in other sentences, the context gives the word silver different meanings and also different parts of speech like adjectives and verbs. This ambiguity is called lexical ambiguity.
Syntactic ambiguity
Take a look at the sentence given below:
“Old men and women were taken to safe locations”
This sentence has a syntactic ambiguity where the scope of the adjective “old” needs to be resolved. In this sentence, we may not know if the adjective applies only to men or to both men and women.
Semantic ambiguity
Semantic ambiguity refers to ambiguity in the meaning. For example, take the sentence:
“Alice loves her mother and so does Jacob.”
The ambiguity here is that we may not know if Jacob loves his own mother or Alice’s mother.
Anaphoric ambiguity
In the below paragraph:
“The horse ran up the hill. It was very steep. It soon got tired.”
In this paragraph, the pronoun ‘it’ is used to refer to the hill first and then to the horse. To interpret this sentence, we need to have knowledge of the world and context. These ambiguities are called anaphoric ambiguities.
Pragmatic Ambiguity
The hardest kind of ambiguity to resolve is the pragmatic ambiguity. This kind of ambiguity arises from the inability to process the intention or sentiment or world belief. For example, in the below conversation:
My wife said: "Please go to the store and buy a carton of milk and if they have eggs, get six."
I came back with 6 cartons of milk.
She said, "why did you buy six cartons of milk?"
I replied, "They had eggs"
As you can see here, the ambiguity is in understanding the intention of the speaker.
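The two readings of the milk-and-eggs instruction can be written out as two small functions. This is only an illustrative sketch: the function names and the dictionary layout are made up for this example and are not part of any NLP library.

```python
def literal_reading(store_has_eggs):
    """Reading taken by the husband: "get six" refers to cartons of milk."""
    cartons_of_milk = 6 if store_has_eggs else 1
    return {"milk": cartons_of_milk, "eggs": 0}


def intended_reading(store_has_eggs):
    """Reading meant by the speaker: "get six" refers to eggs."""
    eggs = 6 if store_has_eggs else 0
    return {"milk": 1, "eggs": eggs}


# Same sentence, same store, two different shopping bags.
print(literal_reading(store_has_eggs=True))
print(intended_reading(store_has_eggs=True))
```

Both functions follow the grammar of the sentence; picking between them needs knowledge of what the speaker most likely wants, which is why pragmatic ambiguity is hard to resolve.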
Wednesday, July 27, 2022
Risperidone (Salt) from 1mg.com
Tags: Medicine,Psychology
Risperidone Uses
Risperidone is used in the treatment of schizophrenia and mania.
How Risperidone works
Risperidone is an atypical antipsychotic. It works by affecting the levels of chemical messengers (dopamine and serotonin) to improve mood, thoughts and behavior.
Common side effects of Risperidone
Insomnia (difficulty in sleeping), Parkinsonism, Sedation, Dizziness, Weight gain, Akathisia (inability to stay still), Anxiety, Gastrointestinal symptom, Increased prolactin level in blood.
EXPERT ADVICE FOR RISPERIDONE
1. Risperidone helps treat schizophrenia and mania.
2. It may cause less weight gain, sedation, and heart problems as compared to other similar medicines.
3. It may take 4-6 weeks to notice any medication effects. Keep taking it as prescribed.
4. Use caution while driving or doing anything that requires concentration as Risperidone can cause dizziness and sleepiness.
5. It may cause increase in weight, blood sugar, cholesterol, and fat. Eat healthy, exercise, and monitor your levels regularly.
6. Inform your doctor if you experience any abnormal movements or restlessness.
7. Inform your doctor if you have a history of heart diseases as Risperidone can increase your risk of irregular heartbeat.
8. Do not stop taking Risperidone without talking to your doctor first as it may cause worsening of symptoms.
Student Update (2022-Jul-27)
Index of Journals
Tags: Student Update
Counting
Komal Kumari Class: 4
Trial 1 (Beginning of class): Till: 16
Trial 2 (After an hour): Till: 20
Srishti Patel Class: Nursery Till: 1
Tables
Kusum Kumari Class: 5 Till: 2
Addition
Kusum Kumari Class: 5 Level: 7
Subtraction
Kusum Kumari Class: 5 Level: 1
URL: https://survival8.blogspot.com/2022/01/add-subtract-multiply-divide.html
Tuesday, July 26, 2022
Detailed Solution to Up to Three Digit Subtraction
Note: We are going to subtract the smaller number from the bigger one.
Enter two numbers between 0 and 999.
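The column-by-column method with borrowing can be sketched in Python as below. This is a minimal sketch and not the code behind the original page: the function name `subtract_with_borrow` and the wording of the step messages are assumptions made for illustration.

```python
def subtract_with_borrow(first, second):
    """Subtract the smaller number from the bigger one, one column at a time.

    Returns the result and a list of human-readable steps.
    """
    if not (0 <= first <= 999 and 0 <= second <= 999):
        raise ValueError("both numbers must be between 0 and 999")

    big, small = max(first, second), min(first, second)

    # Digits listed from the ones place up to the hundreds place.
    big_digits = [big // 10**i % 10 for i in range(3)]
    small_digits = [small // 10**i % 10 for i in range(3)]
    places = ["ones", "tens", "hundreds"]

    result_digits = []
    steps = []
    borrow = 0
    for place, top, bottom in zip(places, big_digits, small_digits):
        top -= borrow  # pay back what the previous column borrowed
        if top < bottom:
            top += 10  # borrow ten from the next column
            borrow = 1
            note = " (borrowed 10 from the next column)"
        else:
            borrow = 0
            note = ""
        digit = top - bottom
        result_digits.append(digit)
        steps.append(f"{place}: {top} - {bottom} = {digit}{note}")

    result = sum(d * 10**i for i, d in enumerate(result_digits))
    return result, steps


answer, explanation = subtract_with_borrow(78, 503)
print(answer)
for line in explanation:
    print(line)
```

Because the smaller number is always subtracted from the bigger one, the hundreds column never needs to borrow, so the result is never negative.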
Monday, July 25, 2022
Student Update (2022-Jul-25)
Index of Journals
Tags: Student Update
Counting
Komal Kumari Class: 4th Till: 16
Srishti Patel Class: Nursery Till: 10
Tables
Kusum Kumari Class: 5B Till: 3
Yash Kashyap Class: 5 Till: 8
Addition
Kusum Kumari Class: 5B Till Level: 4
Subtraction
Kusum Kumari Class: 5B Till Level: 1
Yash Kashyap Class: 5 Till Level: 2
Sunday, July 24, 2022
Converting image to text, saving to disk, reading text from disk and displaying image
A brief introduction to the 'base64' functions 'b64encode' and 'b64decode':

(base) C:\Users\Ashish Jain>python
Python 3.7.1 (default, Dec 10 2018, 22:54:23) [MSC v.1915 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from base64 import b64encode as b, b64decode as d
>>> s = 'hello'
>>> b(bytes(s, 'utf-8'))
b'aGVsbG8='
>>> bs = b(bytes(s, 'utf-8'))
>>> d(bs)
b'hello'
>>> d(b'aGVsbG8=')
b'hello'
>>> d(bs).decode("utf-8")
'hello'

Now with an image:

from base64 import b64decode, b64encode

# Read the raw bytes of the image.
with open('test_image.png', 'rb') as image_handle:
    raw_image_data = image_handle.read()

# Encode the image bytes as base64 and save the encoded text to disk.
encoded_data = b64encode(raw_image_data)
with open('i.txt', 'wb') as f:
    f.write(encoded_data)

# Read the encoded text back from disk.
with open('i.txt', 'rb') as f:
    b = f.read()

print(type(b))            # <class 'bytes'>
print(encoded_data == b)  # True

# Decode the text and write the image back to disk.
with open('i.png', 'wb') as f:
    f.write(b64decode(b))

If you have a text file and it has data such as this:
b'iVB...ggg=='
That means you had called the str() function on 'bytes' type data and saved that string.

If you have a text file that has data such as this:
iVB...ggg==
Then you can read this file as ">>> with open('img.txt', 'rb') as f:" to get 'bytes' type data.

Tags: Technology,Python
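If a file was saved in the first form (the str() of a bytes object), one possible way to recover the data is to parse the text back into bytes with 'ast.literal_eval' before decoding. The sketch below handles both forms; the helper name `load_base64_text` is made up for this example, and it works on strings in memory rather than on real files so that it can be run as-is.

```python
import ast
from base64 import b64decode, b64encode


def load_base64_text(text):
    """Return the decoded bytes from base64 text in either saved form."""
    text = text.strip()
    if text.startswith(("b'", 'b"')):
        # The text is the repr of a bytes object, e.g. b'aGVsbG8='.
        # literal_eval safely turns it back into a real bytes object.
        encoded = ast.literal_eval(text)
    else:
        # The text is plain base64, e.g. aGVsbG8=
        encoded = text.encode("ascii")
    return b64decode(encoded)


payload = b"hello"
plain_form = b64encode(payload).decode("ascii")  # plain base64 text
repr_form = str(b64encode(payload))              # text that starts with b'

print(load_base64_text(plain_form) == payload)
print(load_base64_text(repr_form) == payload)
```

For a real file, the same helper could be applied to the result of reading the file in text mode.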