In the paper on CONTRAfold,
What does the parameter γ mean? How does it control the balance between sensitivity and specificity?
In the paper on RAF,
Why did the authors use max-margin training? (Describe its advantages.)
Summarize how they learned the proposed model.