📘 Accenture Skill Proficiency Test - NLP - Question Report
🧠 SECTION 1: NLP & WORD EMBEDDINGS
Q1. What is GloVe?
Extracted options (interpreted):
- Matrix factorization of (raw) PMI values with respect to squared loss
- Matrix factorization of log-counts with respect to weighted squared loss
- Neural network that predicts words in context and learns from that
- Neural network that predicts words based on similarity and embedding
✅ Correct Answer
✔ Matrix factorization of log-counts with respect to weighted squared loss
💡 Hint
- GloVe = Global Vectors
- Combines count-based methods (co-occurrence matrix) with predictive embedding ideas
- Objective minimizes weighted squared error between word-vector dot products (plus bias terms) and log co-occurrence counts, as sketched below
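For intuition, here's a minimal NumPy sketch of the GloVe loss for a single co-occurrence pair. The weighting-function parameters x_max and alpha are the commonly cited defaults from the GloVe paper; the vectors and count below are toy values.

```python
import numpy as np

def glove_pair_loss(w_i, w_j, b_i, b_j, x_ij, x_max=100.0, alpha=0.75):
    """Weighted squared error for one co-occurrence pair (i, j).

    f(x) caps the influence of very frequent pairs; the squared term
    pushes w_i . w_j + b_i + b_j toward log(x_ij).
    """
    f = (x_ij / x_max) ** alpha if x_ij < x_max else 1.0  # weighting function
    return f * (w_i @ w_j + b_i + b_j - np.log(x_ij)) ** 2

# Toy example: two 4-dimensional vectors and a co-occurrence count of 20
rng = np.random.default_rng(0)
w_i, w_j = rng.normal(size=4), rng.normal(size=4)
print(glove_pair_loss(w_i, w_j, b_i=0.1, b_j=-0.2, x_ij=20.0))
```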
🧠 SECTION 2: TRANSFORMERS & CONTEXTUAL MODELS
Q2. In which architecture are relationships between all words in a sentence modeled irrespective of their position?
Extracted options:
- OpenAI GPT
- BERT
- ULMFiT
- ELMo
✅ Correct Answer
✔ BERT
💡 Hint
- BERT uses bidirectional self-attention
- Every token attends to all other tokens simultaneously (see the sketch below)
- GPT = causal (left-to-right), not fully bidirectional
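To make "every token attends to all tokens" concrete, here's a minimal NumPy sketch of self-attention without the learned query/key/value projections and multi-head machinery real models use; the causal flag shows the GPT-style contrast.

```python
import numpy as np

def self_attention(X, causal=False):
    """Scaled dot-product self-attention over token vectors X (seq_len x dim).

    With causal=False (BERT-style), every token attends to all tokens;
    with causal=True (GPT-style), each token sees only earlier positions.
    """
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)  # pairwise attention logits
    if causal:
        scores = np.where(np.tril(np.ones_like(scores)) == 1, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over positions
    return weights @ X

X = np.random.default_rng(0).normal(size=(5, 8))  # 5 tokens, 8-dim embeddings
print(self_attention(X).shape)  # (5, 8): every output mixes all positions
```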
📊 SECTION 3: EVALUATION METRICS
Q3. The log loss evaluation metric can have negative values.
Options:
- True
- False
✅ Correct Answer
✔ False
💡 Hint
- Log loss = negative log likelihood
- Probabilities ∈ (0,1) → log(probability) ≤ 0 → the leading negative sign makes the loss ≥ 0
- Log loss is always ≥ 0 (demonstrated below)
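A minimal sketch of binary log loss; the clipping constant eps is a common implementation convention (it keeps log() finite), not part of the definition.

```python
import numpy as np

def log_loss(y_true, p_pred, eps=1e-15):
    """Binary cross-entropy: -mean(y*log(p) + (1-y)*log(1-p)).

    Since p lies in (0, 1), every log term is <= 0, so the leading
    minus sign guarantees a result >= 0.
    """
    p = np.clip(p_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

y = np.array([1, 0, 1, 1])
p = np.array([0.9, 0.1, 0.8, 0.6])
print(log_loss(y, p))  # ~0.236, never negative
```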
Q4. Which metric is used to evaluate STT (Speech-to-Text) transcription?
Extracted options:
- ROUGE
- BLEU
- WER
- METEOR
✅ Correct Answer
✔ WER (Word Error Rate)
💡 Hint
- WER = (Substitutions + Deletions + Insertions) / number of words in the reference transcript
- Standard metric for speech recognition (computed in the sketch below)
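A minimal sketch computing WER via a word-level edit distance; the example transcripts are made up.

```python
def wer(reference, hypothesis):
    """Word Error Rate via word-level Levenshtein distance.

    Counts the minimum substitutions + deletions + insertions needed to
    turn the hypothesis into the reference, divided by reference length.
    """
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution / match
    return dp[-1][-1] / len(ref)

print(wer("the cat sat on the mat", "the cat sat on mat"))  # 1/6 ≈ 0.167
```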
🧩 SECTION 4: NLP ALGORITHMS & TAGGING
Q5. Best Cut algorithm works for:
Extracted options:
- Text classification
- Coreference resolution
- POS tagging
✅ Correct Answer
✔ Text classification
💡 Hint
- Best Cut / Min Cut → graph-based partitioning
- Often used for document clustering / classification (see the sketch below)
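As a hedged illustration of the graph-cut idea (loosely in the spirit of the classic min-cut formulation for binary text classification), here's a toy networkx sketch; every score and similarity below is invented.

```python
import networkx as nx

# Hypothetical per-document scores for two classes, plus pairwise
# similarities that encourage similar documents to share a label.
ind_pos = {"d1": 0.9, "d2": 0.6, "d3": 0.2}
ind_neg = {"d1": 0.1, "d2": 0.4, "d3": 0.8}
assoc = {("d1", "d2"): 0.5, ("d2", "d3"): 0.1}

G = nx.DiGraph()
for d in ind_pos:
    G.add_edge("SRC", d, capacity=ind_pos[d])  # cost of labeling d negative
    G.add_edge(d, "SNK", capacity=ind_neg[d])  # cost of labeling d positive
for (u, v), w in assoc.items():
    G.add_edge(u, v, capacity=w)  # cost of separating u and v
    G.add_edge(v, u, capacity=w)

cut_value, (pos_side, neg_side) = nx.minimum_cut(G, "SRC", "SNK")
print(pos_side - {"SRC"})  # labeled positive: {'d1', 'd2'}
print(neg_side - {"SNK"})  # labeled negative: {'d3'}
```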
Q6. BIO tagging is applicable for:
Extracted options:
- Text classification
- Coreference resolution
- NER
- N-grams
✅ Correct Answer
✔ NER (Named Entity Recognition)
💡 Hint
- BIO = Begin, Inside, Outside
- Used in sequence labeling tasks
- Especially common in NER and chunking (example below)
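A quick illustration of BIO tags and how entity spans are recovered from them; the sentence and its labels are invented.

```python
# BIO-tagged tokens for a made-up sentence: B- opens an entity,
# I- continues it, O marks tokens outside any entity.
tagged = [
    ("Accenture", "B-ORG"), ("hired", "O"),
    ("Jane", "B-PER"), ("Doe", "I-PER"),
    ("in", "O"), ("New", "B-LOC"), ("York", "I-LOC"),
]

# Recover entity spans by grouping B-/I- runs
entities, current = [], None
for token, tag in tagged:
    if tag.startswith("B-"):
        current = (tag[2:], [token])
        entities.append(current)
    elif tag.startswith("I-") and current:
        current[1].append(token)
    else:
        current = None

print([(label, " ".join(toks)) for label, toks in entities])
# [('ORG', 'Accenture'), ('PER', 'Jane Doe'), ('LOC', 'New York')]
```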
🎯 SECTION 5: ATTENTION MECHANISMS
Q7. Which of the following are attention mechanisms commonly used in neural network models?
Extracted options:
- Bahdanau attention
- Karpathy attention
- Luong attention
- ReLU attention
- Sigmoid attention
✅ Correct Answers
✔ Bahdanau attention
✔ Luong attention
❌ Incorrect
- Karpathy (not an attention mechanism)
- ReLU / Sigmoid (activation functions, not attention)
💡 Hint
- Bahdanau = additive attention
- Luong = multiplicative (dot-product) attention
- Activations ≠ attention (both score functions are sketched below)
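A minimal NumPy sketch of the two score functions; the matrices W1, W2 and vector v stand in for learned parameters and are random here.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
s = rng.normal(size=d)       # decoder state
H = rng.normal(size=(5, d))  # 5 encoder hidden states

# Luong (multiplicative): score(s, h) = s . h  (dot-product form)
luong_scores = H @ s

# Bahdanau (additive): score(s, h) = v . tanh(W1 @ s + W2 @ h)
W1, W2, v = rng.normal(size=(d, d)), rng.normal(size=(d, d)), rng.normal(size=d)
bahdanau_scores = np.tanh(W1 @ s + H @ W2.T) @ v

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

print(softmax(luong_scores))     # attention weights over the 5 encoder states
print(softmax(bahdanau_scores))
```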
📚 SECTION 6: TEXT PROCESSING & TOPIC MODELING
Q8. Porter Stemmer is used for:
✅ Correct Answer
✔ Stemming words to their root form
💡 Hint
- Example: running → run
- Reduces vocabulary size
- Rule-based, not dictionary-based (see the snippet below)
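A quick NLTK example (assuming nltk is installed); note that rule-based stems need not be dictionary words.

```python
# Requires: pip install nltk
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()
for word in ["running", "connected", "ponies"]:
    print(word, "->", stemmer.stem(word))
# running -> run
# connected -> connect
# ponies -> poni   (rule-based output need not be a real word)
```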
Q9. LDA stands for:
✅ Correct Answer
✔ Latent Dirichlet Allocation
💡 Hint
- Probabilistic topic modeling
- Documents = mixture of topics
- Topics = distribution over words (illustrated below)
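A minimal scikit-learn sketch of both ideas; the four toy documents are invented.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "the cat chased the mouse",
    "dogs and cats make good pets",
    "stocks fell as markets reacted to rates",
    "investors sold shares on rate fears",
]
X = CountVectorizer(stop_words="english").fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

# lda.components_ holds per-topic word weights (topics = distributions
# over words); transform() gives each document's mixture over topics.
print(lda.transform(X).round(2))
```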
Q10. Which of the following criteria are used to choose the optimal number of topics in LDA topic modeling?
Extracted options:
- Min coherence value
- Max model perplexity
- Max coherence value
- Min model perplexity
✅ Correct Answers
✔ Max coherence value
✔ Min model perplexity
💡 Hint
- Coherence → interpretability (higher is better)
- Perplexity → generalization (lower is better)
- Best practice: balance both (see the sketch below)
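A hedged gensim sketch of sweeping the topic count and scoring each candidate; the tiny corpus is invented, and on real data perplexity should be measured on held-out documents.

```python
# Requires: pip install gensim
from gensim.corpora import Dictionary
from gensim.models import LdaModel, CoherenceModel

texts = [doc.split() for doc in [
    "cats and dogs are pets", "dogs chase cats",
    "markets react to interest rates", "investors watch stock markets",
]]
dictionary = Dictionary(texts)
corpus = [dictionary.doc2bow(t) for t in texts]

for k in (2, 3, 4):
    lda = LdaModel(corpus, id2word=dictionary, num_topics=k, random_state=0)
    coherence = CoherenceModel(model=lda, texts=texts, dictionary=dictionary,
                               coherence="c_v").get_coherence()
    perplexity = 2 ** -lda.log_perplexity(corpus)  # lower is better
    print(k, round(coherence, 3), round(perplexity, 1))
# Prefer the k with high coherence and low perplexity (often a trade-off).
```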