Cyber Security Ontologies
![spriley's picture spriley's picture](http://archive.cps-vo.org/sites/default/files/pictures/picture-9495.jpg)
![Established Community Member Established Community Member](/sites/default/files/badges/badge_trusted.png)
I wanted to start a new topic specific to cyber security ontologies. Over the past few years several research projects have developed fairly mature RDF/OWL ontologies for cyber security, each with a different focus. The Science of Security ontologies, built to support the 7 core themes, focused primarily on cyber threat intelligence. The DARPA-funded Integrated Cyber Analysis System (ICAS) ontologies focused primarily on incident response. The Insider Threat Indicator ontology, developed by CMU with DARPA and FBI funding, focused on insider threat. These ontologies share common pieces, such as the use of the OASIS Structured Threat Information eXpression (STIX) and Cyber Observable eXpression (CybOX) languages. Those common pieces should make it straightforward to combine these differently focused ontologies into a master modular Unified Cyber Ontology.
http://www.slideshare.net/shawnriley2/cscss-science-of-security-developing-scientific-foundations-for-the-operational-cybersecurity-ecosystem and https://github.com/daedafusion/cyber-ontology
http://stids.c4i.gmu.edu/papers/STIDS_2015_T06_BenSalem_Wacek.pdf and https://github.com/invincealabs/icas-ontology/tree/master/ontology
And http://resources.sei.cmu.edu/asset_files/TechnicalReport/2016_005_001_454627.pdf
These RDF/OWL ontologies for cyber security enable the advanced analytic methodology known as "object-based production", which is about organizing knowledge to make it more useful. Additionally, because they are written in knowledge representation languages from the field of artificial intelligence, they support logic-based reasoning rules that capture human analysis tradecraft and expert reasoning. Capturing this kind of expert reasoning with approaches like deep learning would be extremely difficult and time-consuming, if not impossible, given how complex the cyber security domain is.
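To make the idea of capturing analyst tradecraft as a reasoning rule concrete, here is a minimal sketch in plain Python. It is not taken from any of the ontologies above: the fact names (`connectedTo`, `KnownC2Server`, `SuspectedCompromisedHost`) and the rule itself are hypothetical, and a real system would express the rule in a rule language over the ontology and run it in a reasoner rather than hand-code it.

```python
# Facts are (subject, predicate, object) triples. All names are hypothetical.
facts = {
    ("host-17", "connectedTo", "203.0.113.9"),
    ("203.0.113.9", "isA", "KnownC2Server"),
    ("host-22", "connectedTo", "198.51.100.4"),
}


def rule_c2_contact(known):
    """Analyst rule: a host that connected to a known C2 server is suspect."""
    inferred = set()
    for host, predicate, target in known:
        if predicate == "connectedTo" and (target, "isA", "KnownC2Server") in known:
            inferred.add((host, "isA", "SuspectedCompromisedHost"))
    return inferred


def forward_chain(initial_facts, rules):
    """Apply every rule repeatedly until no rule produces a new fact."""
    known = set(initial_facts)
    while True:
        new = set()
        for rule in rules:
            new |= rule(known) - known
        if not new:
            return known
        known |= new


closed = forward_chain(facts, [rule_c2_contact])
print(sorted(closed - facts))
```

Because the rule is data-independent, the same rule can be replayed over new facts as they arrive, which is what makes the analysis repeatable.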
I highly recommend considering research projects that leverage these ontologies to keep advancing the field. They have the potential to modernize and transform analytic tradecraft and to enable a next generation of cyber security analytic platforms that go beyond what can be done with deep learning and similar data science areas.
![spriley's picture spriley's picture](http://archive.cps-vo.org/sites/default/files/pictures/picture-9495.jpg)
Since many people are not familiar with the key benefits of ontologies and how they enable object-based production for cyber security and cyber defense, I wanted to provide a few bullets that show the analytic benefits.
- Disparate cyber security data languages and formats are linked to a rich modular cyber security ontology that federates the data in a common language, allowing analysts to search across it without having to know each unique language, format, or naming convention.
- Facilitates reasoning: analysts get access to all the relevant cyber security data to reason over, and their reasoning tradecraft can be captured as reasoning rules that scale with the technology and make the analysis repeatable.
- Automates the analytic technique of hypothesis testing known as "Analytic Pivoting", because every data element or object is extracted and automatically exploited against all sources being fed into the solution. This provides analysts with the evidence required to inform the hypothesis and to generate new hypotheses.
- Automated analytic pivoting can enable automated chaining of individual attacks into campaigns, which in turn enables automated attribution of new attacks to known campaigns based on evidence.
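The pivoting and campaign-chaining bullets above can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: the incident names and observables are made up, each incident is reduced to a set of observables, and "chaining" is taken to mean grouping incidents that share an observable directly or transitively. A real platform would pivot over the federated ontology data rather than a dictionary.

```python
from collections import defaultdict

# Hypothetical incidents, each reduced to the set of observables seen in it.
incidents = {
    "incident-1": {"evil.example.com", "203.0.113.9"},
    "incident-2": {"203.0.113.9", "hash:ab12cd34"},
    "incident-3": {"hash:ab12cd34", "bad.example.net"},
    "incident-4": {"198.51.100.4"},
}


def pivot(observable, all_incidents):
    """Return every incident in which the given observable appears."""
    return {name for name, seen in all_incidents.items() if observable in seen}


def chain_into_campaigns(all_incidents):
    """Group incidents linked, directly or transitively, by shared observables."""
    by_observable = defaultdict(set)
    for name, observables in all_incidents.items():
        for observable in observables:
            by_observable[observable].add(name)

    campaigns, assigned = [], set()
    for start in all_incidents:
        if start in assigned:
            continue
        group, frontier = set(), [start]
        while frontier:
            current = frontier.pop()
            if current in group:
                continue
            group.add(current)
            # Pivot on every observable of the current incident.
            for observable in all_incidents[current]:
                frontier.extend(by_observable[observable] - group)
        assigned |= group
        campaigns.append(group)
    return campaigns


campaigns = chain_into_campaigns(incidents)
for campaign in campaigns:
    print(sorted(campaign))
```

Here incident-1 and incident-3 share no observable directly, yet they land in the same group because each pivots through incident-2.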
![spriley's picture spriley's picture](http://archive.cps-vo.org/sites/default/files/pictures/picture-9495.jpg)
UCO: A Unified Cybersecurity Ontology
Authors: Zareen Syed, Ankur Padia, M. Lisa Mathews, Tim Finin, and Anupam Joshi
Book Title: Proceedings of the AAAI Workshop on Artificial Intelligence for Cyber Security
Date: February 12, 2016
Abstract: In this paper we describe the Unified Cybersecurity Ontology (UCO) that is intended to support information integration and cyber situational awareness in cybersecurity systems. The ontology incorporates and integrates heterogeneous data and knowledge schemas from different cybersecurity systems and most commonly used cybersecurity standards for information sharing and exchange. The UCO ontology has also been mapped to a number of existing cybersecurity ontologies as well as concepts in the Linked Open Data cloud. Similar to DBpedia which serves as the core for general knowledge in Linked Open Data cloud, we envision UCO to serve as the core for cybersecurity domain, which would evolve and grow with the passage of time with additional cybersecurity data sets as they become available. We also present a prototype system and concrete use cases supported by the UCO ontology. To the best of our knowledge, this is the first cybersecurity ontology that has been mapped to general world ontologies to support broader and diverse security use cases. We compare the resulting ontology with previous efforts, discuss its strengths and limitations, and describe potential future work directions.
Type: InProceedings
Publisher: AAAI Press
Link to paper - http://ebiquity.umbc.edu/_file_directory_/papers/781.pdf
![spriley's picture spriley's picture](http://archive.cps-vo.org/sites/default/files/pictures/picture-9495.jpg)
If someone was interested in connecting domain specific ontologies such as the cyber security ontologies discussed in this thread with wider open datasets and ontologies it might be worth using UMBEL. http://umbel.org/
UMBEL is a coherent general structure of 34,000 reference concepts that provides a scaffolding to link and interoperate other datasets and domain vocabularies. The concepts are organized into 31 mostly disjoint SuperTypes. UMBEL is written in OWL 2 and SKOS.
The good news is that the UMBEL reference structure is already linked to 20 ontologies used by different organizations to define their data sources:
- DBpedia Ontology - Links between the DBpedia Ontology classes and the UMBEL Reference Concepts. Half of them come from the linkage between Proton and UMBEL, and the other half come from hand mapping
- GeoNames - GeoNames
- OpenCyc - OpenCyc Ontology
- Schema.org - Schema.org ontology defines entities known by Google and other search engines
- Wikipedia - Links between the Wikipedia pages and the UMBEL Reference Concepts
- DOAP - DOAP (Description of a Project) is a vocabulary for project description
- ORG - The ORG (Core Organization) Ontology is a vocabulary for describing organizational structures for a broad variety of types of organization
- OO - OO (Open Organizations) is a vocabulary providing supplementary terms for organizations that wish to publish open data about themselves
- TRANSIT - TRANSIT (Transit) is a vocabulary for describing transit systems and routes
- TIME - The TIME (Time Ontology) defines temporal entities
- BIBO - BIBO (Bibliographic Ontology)
- CC - CC (Creative Commons Ontology)
- Event - Event Ontology
- FOAF - FOAF (Friend Of A Friend Ontology) used to describe people and organizations
- GEO - WGS84 Geographic Ontology
- MO - MO (Music Ontology)
- PO - PO (Programmes Ontology)
- RSS - RSS (Really Simple Syndication Ontology)
- SIOC - SIOC (Semantically-Interlinked Online Communities Ontology)
- FRBR - FRBR (Functional Requirements for Bibliographic Records)
According to the Linked Open Vocabularies (LOV) service, the UMBEL reference structure, together with its links to these 20 ontologies, would enable you to reach 504 datasets tracked by LOV.
![spriley's picture spriley's picture](http://archive.cps-vo.org/sites/default/files/pictures/picture-9495.jpg)
Those of you who have read my Science of Security paper will know that we presented a Semantic eScience stack and shared 100+ modular ontologies for the cyber security domain to provide a solid foundation to enable cognitive computing and to fully automate analytic pivoting across the data. I thought this article might be of interest to those who are also working on applying cognitive computing to security science.
AI and cognitive computing: how to distinguish the real value proposition
The real value proposition of cognitive computing is embracing ontologies (domain of concepts) based upon open specifications as part of any cognitive solution. Government, academia and, most notably, the healthcare industry are actively embracing open-standards, ontology-based concepts through the World Wide Web Consortium (W3C) Web Ontology Language (OWL) specifications. The Object Management Group (OMG), in conjunction with the Enterprise Data Management Council (EDMC), has developed a Financial Industry Business Ontology (FIBO) based upon OWL to achieve data standards and a model-driven approach to regulatory compliance. Furthermore, the World Health Organization (WHO), which publishes the International Classification of Diseases (ICD), will include OWL in its next release (ICD-11).
Think of the competitive advantage that you can gain by delivering smart applications that can learn of the business user's intent, and yet are flexible enough not to lock you into a vendor's proprietary solution. Now you have achieved a true value proposition that provides you with a competitive advantage that puts your line of business out of reach from your competition.
![spriley's picture spriley's picture](http://archive.cps-vo.org/sites/default/files/pictures/picture-9495.jpg)
Here is a link to the CMU/CERT Insider Threat Indicator Ontology OWL file
http://resources.sei.cmu.edu/asset_files/TechnicalReport/2016_005_112_465537.owl
![spriley's picture spriley's picture](http://archive.cps-vo.org/sites/default/files/pictures/picture-9495.jpg)
OWL Ontology File for the Unified Cybersecurity Ontology (UCO) available here:
![spriley's picture spriley's picture](http://archive.cps-vo.org/sites/default/files/pictures/picture-9495.jpg)
Sep 30, 2016
NISTIR 8138
DRAFT Vulnerability Description Ontology (VDO): a Framework for Characterizing Vulnerabilities
NISTIR 8138 aims to describe a more effective and efficient methodology for characterizing vulnerabilities found in various forms of software and hardware implementations including but not limited to information technology systems, industrial control systems or medical devices to assist in the vulnerability management process. The primary goal of the described methodology is to enable automated analysis using metrics such as the Common Vulnerability Scoring System (CVSS). Additional goals include establishing a baseline of the minimum information needed to properly inform the vulnerability management process, and facilitating the sharing of vulnerability information across language barriers.
http://csrc.nist.gov/publications/PubsDrafts.html#NIST-IR-8138
![sark7's picture sark7's picture](http://archive.cps-vo.org/sites/default/files/pictures/picture-13753.jpg)
Thank you - an excellent and timely topic, and I greatly appreciate the previous references. Some related research sources that may be of interest:
- Ontological Approach toward Cybersecurity in Cloud Computing: https://arxiv.org/pdf/1405.6169.pdf
- Mission Impact of Cyber Events: Scenarios and Ontology to Express the Relationships Between Cyber Assets, Missions and Users: http://www.dtic.mil/cgi-bin/GetTRDoc?AD=ADA517410
- The Essential Features of an Ontology for Cyberwarfare: http://www.crcnetbase.com/doi/abs/10.1201/b15253-7
- Ontological Representation of Networks for IDS in Cyber-Physical Systems: http://rd.springer.com/chapter/10.1007/978-3-319-26123-2_40
- Modeling Cyber-Physical Systems: https://www.researchgate.net/publication/220473317_Modeling_Cyber-Physical_Systems
- Cyber Defense and Situational Awareness: Inference and Ontologies (book chapter): http://www.springer.com/us/book/9783319113906
![sark7's picture sark7's picture](http://archive.cps-vo.org/sites/default/files/pictures/picture-13753.jpg)
Am quite interested in sharing with others concerning the supporting technologies associated with storing, managing, and querying cyber ontologies, particularly as associated with 'big data' implementations.
This appears to be a fast-moving space, so interested to hear the recommendations of others. Some technologies promoted only a few years ago have disappeared and new ones have rapidly grown. Some related thoughts / notes / links:
> RDF / graph / triplestore databases: not mutually exclusive, but some graph DBs are not RDF compliant and some triplestores are less friendly to looser specifications and are storage and computationally demanding, so there are implementation and performance considerations for each approach:
Example storage technologies:
- Apache Jena (RDF): https://jena.apache.org/
- Apache Spark GraphX (graphs): http://spark.apache.org/graphx/
- Neo4J (graph DB): https://neo4j.com/ - storing and querying RDF in Neo4J: http://www.snee.com/bobdc.blog/2014/01/storing-and-querying-rdf-in-ne.html
- CumulusRDF: https://www.w3.org/2001/sw/wiki/CumulusRDF
References / research / posts:
- Lengthy listing of triplestore DBs: https://www.w3.org/2001/sw/wiki/Category:Triple_Store
- Related blog post on RDF databases: http://blog.datagraph.org/2010/04/rdf-nosql-diff
- Research article evaluating performance of several implementations - 'NOSQL Databases for RDF' (2013): http://ribs.csres.utexas.edu/nosqlrdf/nosqlrdf_iswc2013.pdf
> Concerning querying / retrieval, my understanding is that SPARQL is recommended:
- SPARQL: RDF query language: https://en.wikipedia.org/wiki/SPARQL
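For readers new to SPARQL, the core of a query is a set of triple patterns that are joined on shared variables (a "basic graph pattern"). The sketch below mimics that matching in plain Python so it runs without a triplestore. The `ex:` names and the data are hypothetical, and this covers only pattern joins, not the rest of the SPARQL language; in practice you would send the query shown in the comment to an RDF store or library.

```python
# Hypothetical triples; terms beginning with "?" in patterns are variables.
triples = [
    ("ex:incident-1", "ex:usesMalware", "ex:malware-A"),
    ("ex:incident-2", "ex:usesMalware", "ex:malware-A"),
    ("ex:malware-A", "ex:contacts", "ex:c2-server-9"),
    ("ex:incident-3", "ex:usesMalware", "ex:malware-B"),
]


def match_pattern(pattern, triple, binding):
    """Unify one triple pattern with one triple, or return None on mismatch."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if result.get(term, value) != value:
                return None  # variable already bound to something else
            result[term] = value
        elif term != value:
            return None  # constant term does not match
    return result


def query(patterns, data):
    """Join a list of triple patterns, like a SPARQL basic graph pattern."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in data:
                candidate = match_pattern(pattern, triple, binding)
                if candidate is not None:
                    extended.append(candidate)
        bindings = extended
    return bindings


# Roughly equivalent SPARQL:
#   SELECT ?incident ?server
#   WHERE { ?incident ex:usesMalware ?m . ?m ex:contacts ?server }
results = query(
    [("?incident", "ex:usesMalware", "?m"), ("?m", "ex:contacts", "?server")],
    triples,
)
for row in results:
    print(row["?incident"], row["?server"])
```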
> Concerning structuring / maintaining / editing / managing ontologies: considerations tie to the related storage implementation, but in general I understand one quickly faces challenges in managing complexity even in seemingly simple ontologies. Editing by hand purely in WordPad seems to be an unhappy approach. A number of tools and editors, both open source and commercial, are available. Interested in hearing about others, but have worked with these:
- Cognitum FluentEditor: http://www.cognitum.eu/semantics/FluentEditor/
- Protege: http://protege.stanford.edu/
![spriley's picture spriley's picture](http://archive.cps-vo.org/sites/default/files/pictures/picture-9495.jpg)
I think an important consideration is which type of graph best fits the needs and requirements of the project or system. I thought this article did a good job of highlighting some differences.
While you can use a property graph like Neo4j or GraphX to load and query RDF, you lose all the reasoning and inference that description logic brings. Additionally, you move from standards-based W3C languages to proprietary query languages that haven't yet matured into standards.
In my first couple of years I was really focused on fusing threat and security data from multiple sources, but since 2014 I've realized that leveraging description logic could be quite powerful for automating prescriptive analytics.
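As a small illustration of the kind of inference at stake, the sketch below hand-codes two standard RDFS entailment rules (subclass transitivity and type propagation through `rdfs:subClassOf`) in plain Python. It is deliberately much weaker than an OWL description logic reasoner, and the `ex:` classes are hypothetical. The point is only that a store which treats RDF as plain nodes and edges returns the asserted triples and nothing more, while a reasoner also returns the entailed ones.

```python
SUBCLASS, TYPE = "rdfs:subClassOf", "rdf:type"

# Hypothetical asserted triples.
asserted = {
    ("ex:Ransomware", SUBCLASS, "ex:Malware"),
    ("ex:Malware", SUBCLASS, "ex:ThreatTool"),
    ("ex:sample-42", TYPE, "ex:Ransomware"),
}


def rdfs_closure(triples):
    """Apply subclass transitivity and type propagation until nothing changes."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        subclass_pairs = [(s, o) for s, p, o in inferred if p == SUBCLASS]
        new = set()
        for child, parent in subclass_pairs:
            # A subClassOf B, B subClassOf C  =>  A subClassOf C
            for other_child, other_parent in subclass_pairs:
                if parent == other_child:
                    new.add((child, SUBCLASS, other_parent))
            # x type A, A subClassOf B  =>  x type B
            for s, p, o in inferred:
                if p == TYPE and o == child:
                    new.add((s, TYPE, parent))
        if not new <= inferred:
            inferred |= new
            changed = True
    return inferred


closure = rdfs_closure(asserted)
# Entailed but never asserted: sample-42 is a ThreatTool.
print(("ex:sample-42", TYPE, "ex:ThreatTool") in asserted)
print(("ex:sample-42", TYPE, "ex:ThreatTool") in closure)
```

A query for all `ex:ThreatTool` instances finds `ex:sample-42` only against the closure, which is the behavior you give up when the data is loaded without a reasoner.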
Over at DarkLight we've been focusing on making sure the description logic is sound for the hundreds of cyber security ontologies we are using because we want to use both knowledge representation and reasoning to provide AI based virtual analysts that can execute prescriptive analytics based on the logic in the ontologies.
This market update has good information on triple stores and graph databases.
![spriley's picture spriley's picture](http://archive.cps-vo.org/sites/default/files/pictures/picture-9495.jpg)
I found that not many people are aware that ontologies are one of the key enablers of AI, Robots and Automation, and IoT interoperability. You can follow my work with DarkLight to learn more about ontologies being used to create AI-based virtual analysts, but you might also want to check out the following items on robotics and IoT.
https://www.nist.gov/news-events/news/2015/05/standard-knowledge-robots
http://standards.ieee.org/develop/wg/ORA.html
https://www.researchgate.net/publication/307122744_Semantic_Interoperability_for_the_Web_of_Things
The intersection of AI, Robots, and the IoT is starting to heat up and I would expect continued development and progress in the area of knowledge representation and reasoning in these areas as we move forward.
![sark7's picture sark7's picture](http://archive.cps-vo.org/sites/default/files/pictures/picture-13753.jpg)
Thank you for the additional feedback and references. The market update / technical analyst kit (http://stardog.com/tak.pdf ) indeed seems to reinforce that this is a fast moving space with several technologies beginning to overlap (RDF DBs, operational graph DBs, analytic graph DBs).
Understood and agree that full RDF compliance is essential to realize the extensive power of robust semantic storage and retrieval.
Found the following Springer book to be of help in charting a set of tools for semantic engineering: 'A Developer's Guide to the Semantic Web' http://www.springer.com/br/book/9783662437957 . Chapter 14 outlines a set of recommended tools for semantic engineering development embracing full RDF compliance:
- Jena: Java-based web application development platform (including OWL reasoner)
- Sesame: Java framework for RDF storage and querying
- Virtuoso: DB engine combining RDBMS, RDF, OWL, XML support (mentioned in market update)
- Redland: C libraries for RDF support
- Pellet: OWL 2 reasoner for Java
- RacerPro: OWL reasoner and inference server
- Protege: leading environment for ontology development
- NeOn Toolkit: ontology engineering environment with OWL 2 support
- TopBraid Composer: ontology creation/management via visual modeling environment
The W3C Semantic Web wiki is recommended for keeping up with the latest releases and emerging tools: https://www.w3.org/2001/sw/wiki/Tools
Concerning 'big data' support (RDF in Hadoop environments), I understand there is no standard approach and indeed that requisite tools and technologies are still coming into place.
It appears that Apache Spark as a graph store with SPARQL for retrieval is dominant, although Spark does not natively support RDF. However, it appears that there are emerging approaches involving hybridizing technologies (i.e. combining Spark/SPARQL, RDBMS, and RDF), as in this article: http://www.vldb.org/pvldb/vol9/p804-schaetzle.pdf , although the state-of-the-art is implementation intensive.
Thus, it appears there is a way to go before there is true native Hadoop support for RDF storage and retrieval, although one can realistically synthesize particular use case requirements by hybridizing sets of existing technologies and tools...
![knowlengr's picture knowlengr's picture](/sites/default/files/pictures/genericphoto.png)
Am interested in this topic - especially FIBO / Cybersecurity as related to TAXII etc. Any updates from these contributors? Cheers.
![spriley's picture spriley's picture](http://archive.cps-vo.org/sites/default/files/pictures/picture-9495.jpg)
Hi Mark-
We've abandoned the SoS-VO because it does not support or include industry and projects outside the core hard problems for the academic community. This has forced us to move the SoS efforts for scientific knowledge management and semantic interoperability to other forums and efforts like Integrated Cyber and Integrated Adaptive Cyber Defense (IACD), where knowledge representation and reasoning and the resulting expert-system cognitive decision-making models have been identified as required. I'd start by joining the IACD community, who are more interested in SoS scientific knowledge management than the SoS-VO has ever been.
I forgot to mention that the DoD Cyber Crime Center and organizations like MITRE are developing a Digital Forensics ontology based on DFAX. You can see pieces of this ontology on slides 11-13 in this slide deck.
https://www.dfrws.org/2015eu/proceedings/DFRWS-EU-2015-11p.pdf
This will fill in yet another piece of the unified cyber ontology for cyber security and cyber defense.