In this paper, we propose a method for recognizing human activities from first-person videos, which serves as a supplementary technique for improving recognition accuracy. Conventional methods detect objects and infer the user's behavior from the taxonomy of those objects. One recent work improved accuracy by determining key objects based on hand manipulation. However, such a manipulation-based approach is restricted in the scenes and object types to which it applies, because the user's hands do not always convey significant information. In contrast, our proposed attention-based approach detects visually salient objects as key objects in a non-contact manner. Experimental results show that the proposed method outperforms the previous method by 6.4 percentage points in first-person action classification, reaching an average accuracy of 43.3%.