Researchers at Salesforce have developed an algorithm that uses machine learning to condense lengthy documents into accurate, coherent summaries, a technology that could affect fields such as law, medicine, and scientific research. The algorithm blends several strategies. It is trained with supervised learning on example summaries, and it applies an attention mechanism both to the text it is reading and to the text it is generating. This process ensures the system does not return too many repetitive strands of text, which has been a problem for other summarization programs. In addition, the system uses reinforcement learning, experimenting with producing summaries of its own. Northwestern University professor Kristian Hammond lauds the Salesforce algorithm, but says it also illustrates the limits of relying solely on statistical machine learning. “We need a little bit of semantics and a little bit of syntactic knowledge in these systems in order for them to be fluid and fluent,” Hammond says.
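To make the repetition problem concrete, here is a minimal Python sketch that flags word sequences appearing more than once in a generated summary. It is only an illustration of what "repetitive strands of text" can look like. It is not the Salesforce method, and the function name `repeated_ngrams`, the choice of three-word sequences, and the sample sentence are all assumptions made for this example.

```python
from collections import Counter


def repeated_ngrams(text, n=3):
    """Return each n-word sequence that occurs more than once, with its count.

    Hypothetical helper for illustration only; not part of any
    published summarization system.
    """
    tokens = text.lower().split()
    # Slide a window of n words across the token list.
    ngrams = zip(*(tokens[i:] for i in range(n)))
    counts = Counter(ngrams)
    return {gram: count for gram, count in counts.items() if count > 1}


# Made-up summary text that repeats a phrase.
summary = "the court ruled on the case and the court ruled on the appeal"
for gram, count in repeated_ngrams(summary).items():
    print(" ".join(gram), count)
```

A check like this only measures repetition after the fact. According to the description above, the Salesforce system instead discourages repetition while it writes, by attending to the text it has already generated.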
More info here: Technology Review