Biblio
Web identifiers such as usernames, hashtags, and domain names serve important roles in online navigation, communication, and community building. Therefore the entities that choose such names must ensure that end-users are able to quickly and accurately enter them in applications. Uniqueness requirements, a desire for short strings, and an absence of delimiters often constrain this name selection process. To gain perspective on the speed and correctness of name entry, we crowdsource the typing of 51,000+ web identifiers. Surface level analysis reveals, for example, that typing speed is generally a linear function of identifier length. Examining keystroke dynamics at finer granularity proves more interesting. First, we identify features predictive of typing time/accuracy, finding: (1) the commonality of character bi-grams inside a name, and (2) the degree of ambiguity when tokenizing a name - to be most indicative. A machine-learning model built over 10 such features exhibits moderate predictive capability. Second, we evaluate our hypothesis that users subconsciously insert pauses in their typing cadence where text delimiters (e.g., spaces) would exist, if permitted. The data generally supports this claim, suggesting its application alongside algorithmic tokenization methods, and possibly in name suggestion frameworks.
During an advanced persistent threat (APT), an attacker group usually establish more than one C&C server and these C&C servers will change their domain names and corresponding IP addresses over time to be unseen by anti-virus software or intrusion prevention systems. For this reason, discovering and catching C&C sites becomes a big challenge in information security. Based on our observations and deductions, a malware tends to contain a fixed user agent string, and the connection behaviors generated by a malware is different from that by a benign service or a normal user. This paper proposed a new method comprising filtering and clustering methods to detect C&C servers with a relatively higher coverage rate. The experiments revealed that the proposed method can successfully detect C&C Servers, and the can provide an important clue for detecting APT.