Last year's project proposals

-----------------------------------------------------------------

Title: Multi-Agent Strategic Planning

Theme: Multi-Agent Systems, Games on Graphs, Knowledge-Based Strategies

Subject: In a multi-player game, a coalition of players attempts to achieve a given objective within a (potentially hostile) environment, which is considered to be the opponent. Solving such a game means finding a strategy that achieves the objective regardless of the moves of the environment. Rescue missions involving robots and humans and pursuit-evasion games are examples of such games, which are often called multi-agent systems.

An interesting, but complicating circumstance is when the players have limited knowledge about the current state of affairs, say due to limited observation capabilities. Such games are called games with imperfect information. A related aspect is posed by the communication capabilities between players. The problem of strategy synthesis under imperfect information and limited communication is known to be hard, and is an active research area.

This group of projects investigates the modelling of such games, as well as algorithmic techniques for strategy synthesis. In particular, the project focuses on strategies based on the notion of knowledge. In the context of this project, knowledge refers to information, structured suitably, stored and updated during the course of a play, for deciding on a course of action.
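
For the perfect-information special case, solving a reachability game can be sketched with the classical attractor computation. The graph encoding, the function name attractor and the toy arena below are illustrative assumptions, not part of the proposal; games with imperfect information need additional machinery, such as the knowledge-based constructions surveyed in [1].

```python
from collections import deque

def attractor(edges, coalition_nodes, targets):
    """Return (winning_set, strategy) for the coalition in a reachability game.

    edges: dict mapping every node to the list of its successor nodes.
    coalition_nodes: nodes where the coalition picks the move; at all
        other nodes the environment picks.
    targets: nodes the coalition wants to reach.
    """
    # Predecessor lists and the number of outgoing edges of each node.
    preds = {v: [] for v in edges}
    remaining = {v: len(succs) for v, succs in edges.items()}
    for v, succs in edges.items():
        for w in succs:
            preds[w].append(v)

    win = set(targets)
    strategy = {}
    queue = deque(targets)
    while queue:
        w = queue.popleft()
        for v in preds[w]:
            if v in win:
                continue
            if v in coalition_nodes:
                # One move into the winning set is enough for the coalition.
                win.add(v)
                strategy[v] = w
                queue.append(v)
            else:
                # The environment loses only if all of its moves lead into win.
                remaining[v] -= 1
                if remaining[v] == 0:
                    win.add(v)
                    queue.append(v)
    return win, strategy

# Hypothetical arena: the coalition moves at "a", the environment elsewhere.
arena = {
    "a": ["b", "c"],
    "b": ["goal", "d"],
    "c": ["goal"],
    "d": ["d"],
    "goal": ["goal"],
}
win, strategy = attractor(arena, {"a"}, {"goal"})
print(sorted(win), strategy)
```

In this arena the coalition wins from "a" by moving to "c", where the environment has no choice but to enter the target; moving to "b" would let the environment escape to "d".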

Inspirational Reading:

[1] Doyen, L., Raskin, J.F.: Games with imperfect information: Theory and algorithms. Lectures in Game Theory for Computer Scientists pp. 185–212, 2011.

[2] Berwanger, D., Kaiser, L., Puchala, B.: A perfect-information construction for coordination in games. In: Foundations of Software Technology and Theoretical Computer Science (FSTTCS’11). LIPIcs, vol. 13, pp. 387–398, 2011.

[3] Huang, X., van der Meyden, R.: Synthesizing strategies for epistemic goals by epistemic model checking: An application to pursuit evasion games. In: Proceedings of AAAI 2012, 2012.

Supervisor: Dilian Gurov

=================================================================

Title: Combining Deductive Verification with Model Checking

Theme: Program Verification

Subject: Program correctness is always stated relative to some description of what a program is supposed to do. This description is called a specification, and is typically expressed in some (program) logic. Then, to verify a program means to prove that it meets its specification. There are two main aspects of program behaviour (i.e., what they do): how they transform data, and what state sequences they incur. The first aspect is traditionally specified in Hoare logic (by writing contracts for each procedure in terms of preconditions and postconditions), while the second in Temporal logic. Unfortunately, even though related, these two aspects have been considered only separately by the research community, and practically no tool for checking temporal properties makes any use of data transformation properties that might have already been verified with Hoare logic.

In this group of projects, students can investigate the connection between transformational and temporal properties, and can propose an approach for checking temporal properties relative to procedure contracts (or statement-block contracts) expressed in Hoare logic. Apart from a theoretical development, the project can also result in an adaptation of some existing model checker to use Hoare logic contracts that have been verified by some tool for deductive software verification.
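
To make the notion of a procedure contract concrete, the sketch below attaches a precondition and a postcondition to a procedure and checks them at run time. This is only an illustration under stated assumptions: the decorator name contract and the example procedure are hypothetical, and run-time checking is a much weaker guarantee than the deductive verification the project has in mind, which proves the contract for all inputs.

```python
import functools

def contract(pre, post):
    """Attach a precondition and a postcondition to a procedure.

    pre:  predicate over the arguments.
    post: predicate over the result followed by the arguments.
    """
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args):
            if not pre(*args):
                raise ValueError(f"precondition of {func.__name__} violated")
            result = func(*args)
            if not post(result, *args):
                raise ValueError(f"postcondition of {func.__name__} violated")
            return result
        return wrapper
    return decorate

# Contract in Hoare-triple style: {n >= 0} r := isqrt(n) {r*r <= n < (r+1)*(r+1)}
@contract(pre=lambda n: n >= 0,
          post=lambda r, n: r * r <= n < (r + 1) * (r + 1))
def isqrt(n):
    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1
    return r

print(isqrt(10))
```

A tool that checks temporal properties could, in principle, rely on such a verified contract as a summary of the procedure instead of exploring its body.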

Inspirational Reading:

[1] Leslie Lamport: The Temporal Logic of Actions. ACM Transactions on Programming Languages and Systems (TOPLAS), vol 16, no. 3, pp. 872-923, 1994.

[2] Rajeev Alur, Swarat Chaudhuri: Temporal Reasoning for Procedural Programs. Verification, Model Checking, and Abstract Interpretation (VMCAI 2010), pp. 45-60, 2010.

[3] Siavash Soleimanifard, Dilian Gurov: Algorithmic Verification of Procedural Programs in the Presence of Code Variability. Science of Computer Programming, vol. 127, pp. 76-102, 2016.

Supervisor: Dilian Gurov

=================================================================

Title: Contracts for Software Development

Theme: Program Design and Specification

Subject: The ever increasing complexity of electronic and software systems is accompanied by increasingly high demands on functionality, correctness, and time-to-market. Furthermore, modern systems are typically built out of heterogeneous components, complicating even further the systems development and integration process. To handle these challenges, vertical processes, based on layered designs and utilizing abstraction/refinement techniques need to be combined with horizontal ones, based on component-based designs and supported by composition/decomposition techniques. A critical aspect of using this approach is how to delegate the responsibilities along layers and between components, and to make explicit the typically hidden assumptions about the boundary conditions under which the components will fulfill their tasks. Software contracts are ideal tools to solidify both vertical and horizontal processes providing the theoretical background to support formal methods in systems design.

This group of projects aims to develop a uniform framework, algorithms and tools for the specification and verification of software components and systems based on contracts. The main research questions are: (i) what constitutes a contract for a software component, (ii) what aspects of a software component should its contract cover, (iii) what notion of software architecture is needed, (iv) how can contracts be composed and decomposed, (v) how should procedures be treated, (vi) what are suitable languages for expressing contracts, and (vii) how can contracts be checked algorithmically?

Inspirational Reading:

[1] B. Meyer: Object-Oriented Software Construction. Prentice-Hall, Englewood Cliffs, second edition, 1997.

[2] A. Benveniste et al: Contracts for system design. Foundations and Trends in Electronic Design Automation, vol. 12, no. 2–3, pp. 124–400, 2018.

Supervisor: Dilian Gurov

-----------------------------------------------------------------

Title: Brain-inspired (biomimetic) computing

Theme: Algorithms, Connectionist systems, Networks

Subject: Current computational methods, algorithms and available software make computers particularly effective at bookkeeping, solving complex but well-defined problems, searching for specific patterns, etc. However, today’s computers perform rather poorly in tasks that require multi-modal perception to identify complex, undefined patterns in large data streams, or that call for common-sense reasoning and handling of ambiguity, possibly at the cost of precision or speed. Such capabilities are needed to solve a range of real-world problems effectively and to meet growing demands for robustness, flexibility, adaptation and computational scalability (for example, the big data challenge). Brain- or neural-inspired computing is an emerging field of research that aims to design such efficient algorithms based on the principles that the nervous system, and the brain in particular, uses to process generic information. In the family of neural network based (connectionist) systems, the focus is on the biomimetic nature of the network architectures and learning mechanisms. Some of the connectionist methods devised in the realm of brain-inspired computing achieve human-competitive performance in recognition tasks.

In this project, students are first obliged to familiarise themselves with the state-of-the-art methods in the emerging field of computational methods inspired by the brain’s neural networks (connectionist approach) or more general cognitive architectures. Using this background information, students will be in a position to devise their own method or build upon existing techniques to address a selected computational problem. The range of potential applications is wide. It includes, among others, problems with high-dimensional (multivariate) data having complex relationships, which require an explorative search for interesting multi-dimensional patterns with a potentially hierarchical structure (low-level features that serve as building blocks for high-level data representations), and in which a classification task or inference can be performed. Some examples of broad applications are image analysis (or pattern recognition in general), speech recognition, data mining (e.g. medical, financial, industrial), high-dimensional time-series prediction, etc. In the course of the project, special attention should be paid to the scale of the developed computational algorithm, implementation challenges, modularisation and, most importantly, functionality (robustness to noisy conditions, flexibility, effective learning from the environment, capability to handle unsupervised or semi-supervised learning scenarios, etc.).

Supervisor: Pawel Herman

-----------------------------------------------------------------

Title: Computer-aided medical diagnostics

Theme: Artificial intelligence, Classification, Machine learning, Algorithms

Subject: Computer-aided diagnosis has been extensively validated in various medical domains, ranging from biomedical image or signal analysis to expert systems facilitating the process of decision making in clinical settings. Although the usefulness of computational approaches to medical diagnostics is beyond any doubt, there is still a lot of room for improvement to enhance the sensitivity and specificity of algorithms. The diagnostic problems are particularly challenging given the complexity as well as diversity of disease symptoms and pathological manifestations. In the computational domain, a diagnostic problem can often be formulated as a classification or inference task in the presence of multiple sources of uncertain or noisy information. This pattern recognition framework lies at the heart of medical diagnostics projects proposed here.

Below you can find a set of alternative projects (they can be treated individually or in combination).

Possible projects:

Define a diagnostic problem within the medical domain and examine the suitability of machine learning, connectionist (artificial network-based), statistical or soft computing methods to your problem.
Survey the state-of-the-art in computational tools supporting classification of disease symptoms and comparatively examine the diagnostic performance of some of them on a wide range of available benchmark data sets. Define a measure for diagnostic performance.
Identify and address some of the urgent challenges for computer-assisted diagnostics in medicine.

Supervisor: TBD

-----------------------------------------------------------------

Title: Automated scheduling, e.g. university timetabling

Theme: Artificial intelligence, Machine learning, Algorithms, Optimisation

Subject: Planning is one of the key aspects of our private and professional life. Whereas planning our own daily activities is manageable, scheduling in large multi-agent systems with considerable amounts of resources to be allocated in time and space, subject to a multitude of constraints, is a truly daunting task. In consequence, scheduling and timetabling, as prime representatives of hard combinatorial problems, are increasingly addressed algorithmically, using the computational power of today's computers. This computer-assisted practice in setting up timetables for courses, students and lecturers has also gained a lot of interest at universities around the world and still constitutes an active research topic.

In this project, students can address a scheduling problem of their own choice or they can use available university timetabling benchmark data and tailor it to the project's needs. An important aspect of such a project would be to select or compare different algorithms for combinatorial optimisation, and to define a multi-criterion optimisation objective. It could be an opportunity to test computational intelligence and machine learning methodology.

Supervisor: TBD

---------------------------------------------------------------------------------------------

Title: Intelligent control systems

Theme: Machine learning, Artificial intelligence, Algorithms, Control, Soft computing

Subject: There is a clear trend for smarter machines that are able to collect data, learn, recognize objects, draw conclusions and perform behaviors to emerge in our daily life. Advanced intelligent control systems affect many aspects of human activities and can be found in a wide range of industries, e.g. healthcare, automotive, rail, energy, finance, urbanization and consumer electronics among others. By adapting and emulating certain aspects of biological intelligence this new generation of control approaches makes it possible for us to address newly emerging challenges and needs, build large-scale applications and integrate systems, implement complex solutions and meet growing demand for safety, security and energy efficiency.

Possible essay projects:

Select a real-world control problem (traffic control, energy management, helicopter or ship steering, industrial plant control, financial decision support and many others) and propose a new approach using machine learning and soft computing methodology (computational intelligence) that enhances functionality, automatisation and robustness when compared to classical solutions.
Demonstrate functional (and other) benefits of “computationally intelligent” control approaches in relation to the classical methodology in a range of low-scale control problems (benchmarks). Discuss a suitable framework of comparison and potential criteria.
Consider a control robotic application with all constraints associated with autonomous agents and real-world environments (which can be emulated in software). Propose “computationally intelligent” methods to enable your robot agent prototype to robustly perform complex tasks (learn from the environment, evolve over time, find solutions to new emerging problems and adapt to new conditions among others).

Supervisor: TBD

-------------------------------------------------------------------------------

Title: Quality assessment of the output of an ML (deep learning) network

Theme: Algorithms, Machine learning, Classification, Signals

Subject: In machine learning with artificial neural networks and deep networks, a fundamental criticism concerns the problem of assigning an estimate of quality to the answer. As a result, it is generally hard to understand how the output of the network should be interpreted. Approaches have been proposed that either formulate the network activities in terms of statistical (Bayesian) estimates or assess how much a single pixel contributes to the generated output. This project aims at studying the properties and utility of methods to assess the quality of the output of networks.

Below you can find a set of alternative projects (they can be treated individually or in combination).

Possible essay projects:

Develop your own approach or build upon the existing approaches to a specific aspect of network output evaluation, e.g.
- Discuss the different types of evaluation principles that can be used (e.g. sensitivity analysis, saliency analysis, layer-wise relevance propagation).
- Discuss potential problems if quality of output is not addressed.
Alternatively, select and compare a few existing state-of-the-art methods. Focus on selected aspects of a network evaluation problem of your choice.
Discuss key challenges, emerging trends and propose future applications for network output evaluation methodology.

Supervisor: Erik Fransén

-------------------------------------------------------------------------------

Title: Online learning

Theme: Algorithms, Machine learning, Classification, Signals

Subject: One of the major problems in continuous online learning is that previously learned information gets forgotten when new information is learned. This project aims at studying some of the recent proposals to overcome this problem.

Below you can find a set of alternative projects (they can be treated individually or in combination).

Possible essay projects:

Develop your own approach or build upon the existing approaches to a specific online learning problem, e.g.
- Discuss what different aspects of online learning problems there are.
- Discuss advantages and disadvantages of one particular online learning algorithm.
Alternatively, select and compare a few existing state-of-the-art methods. Focus on selected aspects of an online learning problem of your choice.

Supervisor: Erik Fransén

-------------------------------------------------------------------------------

Title: Data-driven construction of dynamical ODE-models

Theme: Algorithms, Dynamic simulation, ODE-models

Subject: Data-driven dynamic modeling and simulation refers to the (automatic) construction of models from data. This approach is currently being developed to construct dynamical ODE-based models from data, typically static data obtained from chemical, biological or physical experiments. This project aims at studying some of the recent proposals to automatically construct dynamical ODE-models from data.

Below you can find a set of alternative projects (they can be treated individually or in combination).

Possible essay projects:

Develop your own approach or build upon the existing approaches to a specific ODE-model construction problem, e.g.
- Discuss and analyze types of ODE-systems that are feasible to construct this way.
- Discuss aspects of the automatic construction (parameter estimation) that are particularly hard to achieve or important to do.
Alternatively, select and compare a few existing state-of-the-art methods. Focus on selected aspects of an ODE-model construction problem of your choice.
Discuss key challenges, emerging trends and propose future applications for automatic ODE-model construction methodology.

Supervisor: Erik Fransén

-------------------------------------------------------------------------------

Title: Real-time virtual crowds: Analysis, simulation and evaluation in Unity 3D

Theme: Multi-agent systems, 2D visualisation, 3D graphics, animation

Crowds of realistically-behaving synthetic characters have many application domains, from special effects for entertainment (see for example, World War Z), where they replace expensive extras and stunt performers, to evacuation and traffic simulations, where the safety and feasibility of designs can be tested and modified prior to the costly construction of real environments.

Key challenges in this area relate to understanding the movement patterns in real crowd data, creating realistic or plausible simulations of crowd behaviour, rendering the geometry of large numbers of characters in real-time, and evaluating simulation algorithms against each other and against real data.

See [1] for an extensive set of examples of previous student work in this area (including previous DD142x Kexjobb projects - see 'Computer science' section). These projects will be based on Unity 3D, a powerful beginner-friendly 3D game engine, and a 2D crowd visualisation engine being developed at ESAL, KTH [1].

Below you can find a set of alternative projects (they can be treated individually or in combination).

Possible essay projects:

Analysis and/or 2D visualisation of real or simulated crowd data to highlight patterns in crowd movement. Methods include implementing and evaluating metrics to determine the similarity between trajectories and/or clustering similar trajectories together to determine flow lanes [2].
3D rendering of virtual characters in the Unity 3D engine as imposters, including performance comparisons with brute-force approaches. Imposter methods [3] render the complicated geometry of virtual characters in a much simpler way, with minimal visual differences for the end-user. Imposters can also be dynamically lit and shaded.
Implementation/comparison of crowd simulation methods in Unity 3D enabling collision avoidance at multiple different crowd densities [4]. Flow rate, energy usage and collision free simulation in different environment scenarios are important aspects of this work. Comparison with other state-of-the-art methods in a variety of circumstances.
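
As an illustration of the first project idea, a similarity metric between two trajectories can be sketched with dynamic time warping (DTW). This is only one common choice and an assumption made here; it is not the distance used in [2], which works on line segments. The function name dtw_distance and the sample trajectories are hypothetical.

```python
import math

def dtw_distance(traj_a, traj_b):
    """Dynamic time warping distance between two 2D trajectories.

    Each trajectory is a list of (x, y) positions. DTW aligns the two
    sequences in time before summing point distances, so agents that walk
    the same path at different speeds still come out as similar.
    """
    n, m = len(traj_a), len(traj_b)
    inf = float("inf")
    # cost[i][j]: cheapest alignment of the first i points of traj_a
    # with the first j points of traj_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(traj_a[i - 1], traj_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # traj_b waits
                                 cost[i][j - 1],      # traj_a waits
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]

walker = [(0, 0), (1, 0), (2, 0)]
slow_walker = [(0, 0), (0, 0), (1, 0), (2, 0)]    # same path, pauses at the start
parallel_walker = [(0, 1), (1, 1), (2, 1)]        # same direction, one unit to the side

print(dtw_distance(walker, slow_walker))
print(dtw_distance(walker, parallel_walker))
```

A matrix of such pairwise distances could then be fed to a clustering method to group trajectories into flow lanes.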

References:

[1] Student projects, Embodied Social Agents Lab (ESAL), https://www.csc.kth.se/~chpeters/ESAL/studentprojects.html

[2] J-G Lee, J. Han and K-Y Whang, Trajectory Clustering: A Partition-and-Group Framework, In Proceedings of the 2007 ACM SIGMOD international conference on Management of data (SIGMOD '07). ACM, New York, NY, USA, 593-604.

[3] E. Millan and I. Rudomín. Impostors and pseudo-instancing for GPU crowd rendering. In Proceedings of the 4th international conference on Computer graphics and interactive techniques (GRAPHITE). ACM, 2006

[4] R. Narain, A. Golas, S. Curtis, and M.C. Lin. Aggregate Dynamics for Dense Crowd Simulation. In Proceedings of ACM Transactions on Graphics (SIGGRAPH Asia), 2009

Supervisor: Christopher Peters

-------------------------------------------------------------------------------

Title: Cyber security

Theme: Cyber security

Subject: With the general digitalization of our society by IoT, cloud, and AI, immensely complex IT-infrastructures are being formed. Obviously, ensuring that these infrastructures are resilient to cyber attacks is vital for the well-being of our society. However, merely gaining an overview of this environment is challenging, not to mention understanding and assessing its cyber security posture.

Attack simulations may be used to assess the cyber security of complex systems. In such simulations, the steps taken by an attacker in order to compromise sensitive system assets are traced and documented. Attack graphs constitute a suitable formalism for the modeling of attack steps and their dependencies, allowing the subsequent simulation. The Meta Attack Language (MAL) has been proposed for the design of domain-specific attack languages. MAL provides a formalism that allows the semi-automated generation as well as the efficient computation of very large attack graphs.
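
The idea of an attack graph can be sketched in a few lines of plain Python. This is not MAL syntax; the AND/OR encoding, the function name and the step names are assumptions made for illustration only.

```python
def reachable_attack_steps(steps, entry_points):
    """Compute which attack steps an attacker can eventually reach.

    steps: dict mapping a step name to (kind, parents), where kind is
        "OR" (any compromised parent suffices) or "AND" (all parents
        must be compromised).
    entry_points: steps the attacker is assumed to hold from the start.
    """
    compromised = set(entry_points)
    changed = True
    while changed:  # iterate until no new step becomes reachable
        changed = False
        for name, (kind, parents) in steps.items():
            if name in compromised or not parents:
                continue
            hits = [p in compromised for p in parents]
            if (kind == "OR" and any(hits)) or (kind == "AND" and all(hits)):
                compromised.add(name)
                changed = True
    return compromised

# Hypothetical attack graph: logging in needs both credentials and network access.
graph = {
    "phishing": ("OR", []),
    "network_access": ("OR", []),
    "credentials": ("OR", ["phishing"]),
    "login": ("AND", ["credentials", "network_access"]),
    "read_database": ("OR", ["login"]),
}

print(sorted(reachable_attack_steps(graph, {"phishing"})))
print(sorted(reachable_attack_steps(graph, {"phishing", "network_access"})))
```

Real attack simulations typically also attach probabilities or time-to-compromise estimates to the steps; the sketch only answers the yes/no reachability question.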

This proposal offers the opportunity for multiple projects of various types, including but not limited to:
• Mitre Enterprise ATT&CK matrix in MAL
• Modeling of a Cyber Attack on an Industrial Control System
• Estimating human resilience to social engineering attacks through computer configuration data
• Automatic generation of cyber attack models in critical infrastructures
• Enriching Attack Models with Threat Actor Information
• Adding forensic evidence in threat models
• Influences of Attack Simulation Outcomes on the Business
• Measuring Coverage Criteria for Attack Simulations
• A Development Environment for Creating MAL Instances
• Enriching Threat Models by Environmental Information
• Ethical hacking / penetration testing of IoT devices

Supervisor: Robert Lagerström (robertl@kth.se)

-------------------------------------------------------------------------------

Title: Automatically Generating Questions

Theme: Natural Language Processing, Question-based Learning, Education

Subject: Advances in the understanding of teaching and learning point towards the benefits of using regular formative assessment (informal quizzes and exercises) over summative assessment (formal examination). However, it is time-consuming and laborious to manually generate and regularly update activities for students. Alternative approaches, such as involving students in the generation of activities, are promising, but raise problems in terms of quality control, completeness and bias towards preferred topics. This project will investigate the potential of automatically generating activities, such as multiple choice questions, that are indistinguishable from human-generated questions, within the domain of introductory programming.

Supervisor: Ric Glassey

-------------------------------------------------------------------------------

Title: Repo Mining

Theme: Repository Mining, Software Quality, Analytics

Subject: Open source software repositories capture much more than source code. Entire project life cycles are captured in numerous commits, issues, and conversations made by developers as they work. This data is automatically recorded within popular platforms like GitHub.com at no extra cost. However, little use is made of this data after a project has moved on in its life cycle. The availability of public APIs has made accessing this rich historical data trivial, and creates the opportunity to mine and analyze software projects in greater detail than ever before. By fusing the multiple sources of data together, it is possible to develop a deeper understanding of the dynamic forces present in the process of software engineering.

Supervisor: Ric Glassey

-------------------------------------------------------------------------------

Title: Classification of Particle Trajectories during Magnetic Reconnection

Theme: High-Performance Computing (HPC), Machine Learning

Subject: Magnetic reconnection is an important phenomenon occurring in space and astrophysical systems. Magnetic reconnection leads to the conversion of magnetic field energy to kinetic energy, accelerating electrons and protons. As starting point of this project, we provide a set of electron positions and velocities that have been obtained by running the iPIC3D Particle-in-Cell code for simulating magnetic reconnection. The goal of this project is to explore the given dataset with Machine Learning techniques and investigate the possibility of categorizing different particle trajectories during the magnetic reconnection.

References

Markidis, Stefano, Giovanni Lapenta and Rizwan-uddin. "Multi-scale simulations of plasma with iPIC3D." Mathematics and Computers in Simulation 80.7 (2010): 1509-1519.
Peng, Ivy Bo, et al. "Energetic particles in magnetotail reconnection." Journal of Plasma Physics 81.2 (2015).

Supervisor: Stefano Markidis

-------------------------------------------------------------------------------

Title: Assessing Precision Improvement with Emerging Floating Point Formats

Theme: Floating Point Representations

Subject: Recently, new floating-point formats have been proposed to address some of the limitations of the IEEE 754 floating-point format. Among the most promising new formats are Posit, bfloat and flexpoint. The goal of this project is to assess the precision improvement or loss of one of these new formats with respect to the IEEE 754 format. The project includes the development of a benchmark code using the IEEE 754 floating-point format and one of the emerging floating-point formats, e.g. Posit. Possible benchmark codes could include matrix-matrix multiplication, convolution and an n-body simulator.
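
A minimal sketch of such a precision comparison is given below, assuming a bfloat16-like format emulated by truncating an IEEE 754 single-precision value to its upper 16 bits. Real bfloat16 implementations usually round to nearest rather than truncate, and Posit would need a dedicated library; the helper names and the test vectors are hypothetical.

```python
import struct

def to_float32(x):
    """Round x to the nearest IEEE 754 single-precision value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

def to_bfloat16(x):
    """Emulate bfloat16 by keeping only the upper 16 bits of a float32.

    Assumption: truncation instead of round-to-nearest, for simplicity.
    """
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFF0000))[0]

def dot(xs, ys, quantize):
    """Dot product in which every intermediate result is quantized."""
    acc = 0.0
    for x, y in zip(xs, ys):
        product = quantize(quantize(x) * quantize(y))
        acc = quantize(acc + product)
    return acc

xs = [0.1 * i for i in range(1, 101)]
ys = [1.0 / i for i in range(1, 101)]
exact = sum(x * y for x, y in zip(xs, ys))  # float64 reference, close to 10.0

for name, quantize in [("float32", to_float32), ("bfloat16 (truncated)", to_bfloat16)]:
    approx = dot(xs, ys, quantize)
    rel_err = abs(approx - exact) / abs(exact)
    print(f"{name}: relative error {rel_err:.2e}")
```

The same pattern (compute a high-precision reference, rerun the kernel with quantized arithmetic, report the relative error) carries over to the larger benchmark codes mentioned above.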

Reference

Chien, Steven WD, Ivy B. Peng, and Stefano Markidis. "Posit NPB: Assessing the Precision Improvement in HPC Scientific Applications." arXiv preprint arXiv:1907.05917 (2019).

Supervisor: Stefano Markidis

-------------------------------------------------------------------------------

Title: Evaluating the Performance of TensorFlow as General Purpose Programming Framework

Theme: TensorFlow

Subject: TensorFlow is a framework initially designed for deep-learning workloads, possibly running on GPUs. The goal of this project is to evaluate the performance of general-purpose benchmark codes developed in TensorFlow and compare their performance with that of other approaches. The code performance can be tested on systems with CPU and GPU. Possible benchmark codes could include matmul, convolution, and FFT. Alternative approaches are NumPy for CPU programming, and pyculib and Numba for GPU programming.

Reference

Chien, Steven WD, Stefano Markidis, Vyacheslav Olshevsky, Yaroslav Bulatov, Erwin Laure, and Jeffrey S. Vetter. "TensorFlow Doing HPC." arXiv preprint arXiv:1903.04364 (2019).

Supervisor: Stefano Markidis

-------------------------------------------------------------------------------

Title: Neuro-inspired event-based Computer Vision Algorithms

Subject: Novel computer vision systems that are inspired by how human eyes work show significant advantages in terms of high-speed sensing and reduced processing power requirements. However, the fundamentally different data format (events rather than images) also requires complete rethinking of image processing algorithms; much closer to how distributed neuronal networks operate.

In the project domain, your research will be in designing fundamentally novel algorithms for extracting a particular visual aspect, e.g. optic flow, scene understanding, distance (i.e. disparity), gesture recognition, object tracking, counting, or remote recognition. Your developed algorithm(s) shall be compared and contrasted to existing event-based algorithms and to conventional computer vision in terms of accuracy, latency, and computing demands. We will provide recorded data sets and also hardware + software infrastructure to conduct experiments.

More information about this research field: https://tinyurl.com/DD142x-ebv

Supervisor: Jörg Conradt

-------------------------------------------------------------------------------

Title: Neuronal Models for real-time Robotic Motor Control

Subject: Motor control for complex (and possibly compliant) robotic actuators (such as factory robots, home/personal assistants, and also neuro-prosthetic devices) is a challenging task given limited computing resources. Often the system to be controlled is unknown or deviates significantly from its design specification. Neuronal networks (such as the human brain, the cerebellum, and the spinal cord) excel in the control of our body’s movement. In this project context, students shall investigate neuronal methods to learn the generation and/or execution of desired motor trajectories. We will provide several wheeled or legged robots and robotic arms, both in simulation and as real robots, to perform real-time experiments. Students shall compare the performance of neuronal algorithms and neuronal learning against traditional methods for robot control.

More information about this research field: https://tinyurl.com/DD142x-nmc

Supervisor: Jörg Conradt

-------------------------------------------------------------------------------

Title: Large-scale real-time Spiking Neuronal Models on Neuromorphic Hardware

Subject: For applications in consumables, wearables, or mobile robotics, large-scale neuronal networks need to operate in real-time under reasonable power budgets. The field of neuromorphic computing investigates software and hardware options to efficiently execute neuronal networks, often implemented as spiking systems in custom hardware. This research community has recently attracted substantial interest from traditional hardware manufacturers (such as IBM, Qualcomm, or Intel), but also from system and application providers (such as Bosch, BMW, Samsung). In this project area, the students shall explore spiking neuronal networks, identify a problem that can be addressed by a spiking neuronal network, and implement the system on KTH’s existing neuromorphic hardware (SpiNNaker, Spikey, TrueNorth, or Loihi). The result shall be compared against traditional computing architectures in terms of achievable complexity and power consumption.

More information about this research field: https://tinyurl.com/DD142x-nmh

Supervisor: Jörg Conradt

-------------------------------------------------------------------------------

Title: Spatial analysis of neuron populations

Theme: Neuroscience, Neural networks, Computer science

Subject: Neurons in the brain form networks by connecting to each other at apposition points, where they form chemical or electrical synapses. In large-scale simulations of realistic neural networks, the appositions can be directly computed from the morphologies of individual neurons. The task is to suggest efficient and scalable algorithms for touch detection in populations of morphologically detailed neurons. Digitally reconstructed neurons are available from the public repositories [1, 2].
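
One possible starting point for touch detection is a uniform spatial grid, which avoids comparing every sample point with every other. The sketch below is a simplification under stated assumptions: neurites are reduced to sample points, a touch is any pair of points from different neurons closer than a fixed radius, and the function name and data layout are hypothetical.

```python
from collections import defaultdict
from itertools import product
import math

def find_touches(points, radius):
    """Return index pairs of points from different neurons closer than radius.

    points: list of (neuron_id, (x, y, z)) samples along the neurites.
    The grid cell size equals radius, so a point only needs to be compared
    with points in its own cell and the 26 neighbouring cells.
    """
    grid = defaultdict(list)
    for idx, (_, pos) in enumerate(points):
        cell = tuple(math.floor(c / radius) for c in pos)
        grid[cell].append(idx)

    touches = set()
    for cell, members in grid.items():
        for offset in product((-1, 0, 1), repeat=3):
            neighbour = tuple(c + o for c, o in zip(cell, offset))
            # .get() avoids creating empty cells while iterating over the grid.
            for j in grid.get(neighbour, ()):
                for i in members:
                    if i < j and points[i][0] != points[j][0]:
                        if math.dist(points[i][1], points[j][1]) < radius:
                            touches.add((i, j))
    return touches

samples = [
    (0, (0.0, 0.0, 0.0)),
    (1, (0.5, 0.0, 0.0)),
    (1, (5.0, 5.0, 5.0)),
    (0, (0.2, 0.0, 0.0)),
    (0, (0.9, 3.0, 0.0)),
    (1, (1.1, 3.0, 0.0)),   # lies in a different grid cell than the previous point
]
print(sorted(find_touches(samples, radius=1.0)))
```

A realistic implementation would test distances between neurite segments (cylinders) rather than points and would distribute the grid across processes for large populations.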

References:

[1] Neuromorpho.org

[2] Computational Neurobiology and Imaging Center, http://research.mssm.edu/cnic/repository.html

Supervisor: Alexander Kozlov

-------------------------------------------------------------------------------

Title: Efficient data structure for encoding neuronal morphology

Theme: Neuroscience, Cell biology, Computer science

Subject: Neurons have a complex shape determined by a cell body (soma) and neurites (dendrites and axon). Computer simulations of the nervous system often use realistic models of neurons that preserve the geometry of individual cells. Neuron morphology has a tree-like structure and can be fully specified in a very simple format [1, 2]. The task is to find efficient ways of representing morphological data in computer memory that allow fast random and sequential access to neurites.
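One possible baseline representation is sketched below. It assumes each sample point of the tree is given as a row (id, type, x, y, z, radius, parent id), with parent id -1 for the root; this row layout is an assumption made for the illustration. The class name `Morphology` and its methods are hypothetical. Points are stored in flat typed arrays, and the children of each point are kept in a compressed index list, so both sequential scans and random lookups avoid per-node objects.

```python
from array import array


class Morphology:
    """Flat-array tree; rows are (id, type, x, y, z, radius, parent_id)."""

    def __init__(self, rows):
        rows = sorted(rows, key=lambda r: r[0])
        # Map the ids used in the file to dense positions 0..n-1.
        index_of = {r[0]: k for k, r in enumerate(rows)}
        self.types = array("b", (r[1] for r in rows))
        # x, y, z, radius stored contiguously, four doubles per point.
        self.xyzr = array("d", (v for r in rows for v in r[2:6]))
        self.parent = array("l", (index_of.get(r[6], -1) for r in rows))

        # Children in compressed (CSR-like) form: the children of point k are
        # child_index[child_start[k]:child_start[k + 1]].
        n = len(rows)
        self.child_start = array("l", [0] * (n + 1))
        for p in self.parent:
            if p >= 0:
                self.child_start[p + 1] += 1
        for k in range(n):
            self.child_start[k + 1] += self.child_start[k]
        fill = list(self.child_start[:-1])
        self.child_index = array("l", [0] * self.child_start[n])
        for k, p in enumerate(self.parent):
            if p >= 0:
                self.child_index[fill[p]] = k
                fill[p] += 1

    def point(self, k):
        """Random access: (x, y, z, radius) of point k."""
        return tuple(self.xyzr[4 * k:4 * k + 4])

    def children(self, k):
        return list(self.child_index[self.child_start[k]:self.child_start[k + 1]])

    def path_to_root(self, k):
        """Positions from point k back to the root, inclusive."""
        path = [k]
        while self.parent[path[-1]] >= 0:
            path.append(self.parent[path[-1]])
        return path
```

Whether this layout is actually faster than a pointer-based tree for a given access pattern is exactly the kind of question the project could measure.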

References:

[1] Cannon RC et al. (1998) An on-line archive of reconstructed hippocampal neurons. J Neurosci Methods 84(1-2):49-54.

[2] Neuromorpho.org

Supervisor: Alexander Kozlov

-------------------------------------------------------------------------------

Title: Automatic classification of neurons by their morphology

Theme: Neuroscience, Cell biology, Machine learning

Subject: Neurons have a tree-like shape (morphology) and differ in size, topology, and other geometric properties depending on their role and location in the nervous system. Each morphological type exhibits a characteristic set of geometric features. The task is to divide neurons into different morphological types using an automatic procedure, and to test candidate algorithms for accuracy and speed. Digitally reconstructed neurons are available from public repositories [1, 2].

References:

[1] Neuromorpho.org
[2] Computational Neurobiology and Imaging Center, http://research.mssm.edu/cnic/repository.html

Supervisor: Alexander Kozlov

-------------------------------------------------------------------------------

Title: Creating the brain tissue

Theme: Neuroscience, Cell biology, Neural networks, Computer science

Subject: Neurons are tightly packed in the brain, with neuron density sometimes exceeding 100,000 neurons per cubic millimeter. This makes cell placement a non-trivial problem in large-scale realistic simulations of neural networks. The task is to suggest and try algorithms for filling a structured volume with cells whose somata (i.e. cell bodies only, ignoring the many thin neurites) do not overlap. All cells have different sizes; neurons should be placed randomly, with possible density gradients for different cell types.
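A simple baseline to compare other algorithms against is random placement with rejection, sketched below under strong assumptions: somata are modelled as spheres, the volume is an axis-aligned box, and density is uniform (no gradients, no cell types). The function name `place_somata` and its parameters are hypothetical.

```python
import math
import random


def place_somata(radii, box, max_tries=1000, seed=0):
    """Place non-overlapping spheres uniformly at random inside a box.

    radii: soma radii, one per cell.
    box: (width, height, depth) of the volume, with one corner at the origin.
    Returns a list of (x, y, z, radius) tuples.
    """
    rng = random.Random(seed)
    placed = []
    # Placing large cells first makes late rejections less likely.
    for r in sorted(radii, reverse=True):
        for _ in range(max_tries):
            centre = tuple(rng.uniform(r, side - r) for side in box)
            # Accept only if the new sphere overlaps none of the placed ones.
            if all(math.dist(centre, q[:3]) >= r + q[3] for q in placed):
                placed.append(centre + (r,))
                break
        else:
            raise RuntimeError("could not place a soma; volume may be too dense")
    return placed
```

Each candidate is checked against every placed cell, so the cost grows quadratically with the number of cells, and the rejection rate rises sharply at high packing densities. Both limitations are natural targets for the project.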

Supervisor: Alexander Kozlov

-------------------------------------------------------------------------------

Title: Brain slice analysis

Theme: Neuroscience, brain architecture, machine learning

Subject: High-throughput neuroscience deals with large amounts of experimental data, for example images of brain slices in which neurons appear as fuzzy spots of higher chemical-signal intensity. Automatic recognition of neurons in images with a noisy background would speed up the digitization of the brain. The task here is to suggest and try out different methods of cell counting, from signal thresholding to machine learning, including deep learning techniques.
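The simplest end of that spectrum, signal thresholding, can be sketched as follows. This is only an illustrative baseline: it assumes a grayscale image given as a list of rows of intensities, a fixed global threshold, and that each neuron appears as one connected bright region. The function name `count_cells` and the `min_pixels` noise filter are hypothetical.

```python
from collections import deque


def count_cells(image, threshold, min_pixels=1):
    """Count 4-connected regions of pixels with intensity >= threshold.

    Regions smaller than min_pixels are treated as noise and ignored.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] < threshold:
                continue
            # Flood-fill the bright region that contains (r, c).
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if size >= min_pixels:
                count += 1
    return count
```

Touching or overlapping cells merge into one region under this scheme, and a single global threshold fails when the background varies across the slice, which motivates the learning-based methods mentioned above.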

References:

Allen Brain Atlas, https://portal.brain-map.org/

Supervisor: Alexander Kozlov

-------------------------------------------------------------------------------

Title: Improving the scientific peer-review process

Theme: Algorithms / Simulations

Subject: The principle of peer review is central to the evaluation of research, by ensuring that only high-quality scientific works are funded or published. But peer review has also received criticism, as the reviewers’ time and attention are limited (and unpaid) resources. Naturally, human bias also affects the review process. In 2014, the organizers of the premier machine learning conference “Neural Information Processing Systems” conducted an experiment in which 10% of submitted manuscripts (166 items) went through the review process twice. Arbitrariness was measured as the conditional probability for an accepted submission to get rejected if examined by the second committee. This number was equal to 60%, indicating

Possible essay project:

Learn what goes into a typical peer review, and develop a method to simulate peer review for a scientific conference (model the submissions and reviewers, and include noise).
Think about ways to improve the accuracy of the peer-review process without creating (too much) extra work for the reviewers. Hint: think about the Elo rating system used in chess and other pairwise comparison methods.
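A first simulation along these lines might look like the sketch below. It is one possible model, not the one used in the 2014 experiment: each paper is assumed to have a hidden quality, each review is that quality plus Gaussian noise, and a committee accepts a fixed fraction of papers by mean score. All names and default values (`simulate_committee`, `arbitrariness`, the acceptance rate, the noise level) are assumptions chosen for illustration.

```python
import random


def simulate_committee(qualities, rng, n_reviewers=3, noise=1.0, accept_rate=0.25):
    """Return the set of paper indices accepted by one noisy committee."""
    scores = [
        sum(q + rng.gauss(0.0, noise) for _ in range(n_reviewers)) / n_reviewers
        for q in qualities
    ]
    n_accept = round(accept_rate * len(qualities))
    ranked = sorted(range(len(qualities)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:n_accept])


def arbitrariness(n_papers=1000, noise=1.0, seed=0):
    """Fraction of papers accepted by committee A that committee B rejects."""
    rng = random.Random(seed)
    qualities = [rng.gauss(0.0, 1.0) for _ in range(n_papers)]
    first = simulate_committee(qualities, rng, noise=noise)
    second = simulate_committee(qualities, rng, noise=noise)
    return len(first - second) / len(first)
```

Sweeping `noise` or `n_reviewers` shows how quickly the two committees diverge, and the same harness can then be used to test whether a pairwise-comparison scheme reduces that divergence for the same reviewer effort.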

Supervisor: Kevin Smith

-------------------------------------------------------------------------------

Title: User behavior prediction

Theme: Algorithms

Subject: How can we model users’ preferences, and how can we use such models to predict their actions? By modeling user behavior from data such as browsing history or ratings of products and media, we can help users obtain the information they need more directly. Netflix held an open competition from 2006 to 2009 for the best algorithm to predict user ratings for films based on previous ratings, and awarded a $1,000,000 prize to the winners in 2009. Many other high-tech companies face similar problems, and datasets are publicly available for many of them.

Possible essay project:

Predict whether a user will like a movie based on previous ratings
Predict where a user will spend their next holiday based on their browsing history
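For the movie-rating task, a useful reference point before trying anything more elaborate is a bias baseline: predict the global mean rating, adjusted by how far the film and the user tend to deviate from it. The sketch below assumes ratings are given as (user, item, rating) triples; `fit_baseline` and `predict` are hypothetical names, and this is a common baseline rather than a competitive method.

```python
from collections import defaultdict


def fit_baseline(ratings):
    """Fit global mean, per-item bias, and per-user bias from (user, item, rating)."""
    mean = sum(r for _, _, r in ratings) / len(ratings)

    # Item bias: average deviation of the item's ratings from the global mean.
    item_dev = defaultdict(list)
    for _, item, r in ratings:
        item_dev[item].append(r - mean)
    item_bias = {i: sum(d) / len(d) for i, d in item_dev.items()}

    # User bias: average deviation left over after removing the item bias.
    user_dev = defaultdict(list)
    for user, item, r in ratings:
        user_dev[user].append(r - mean - item_bias[item])
    user_bias = {u: sum(d) / len(d) for u, d in user_dev.items()}

    return mean, user_bias, item_bias


def predict(model, user, item):
    """Predicted rating; unknown users or items contribute zero bias."""
    mean, user_bias, item_bias = model
    return mean + user_bias.get(user, 0.0) + item_bias.get(item, 0.0)
```

A prediction above some cut-off (for example, the user's own mean rating) can then be read as "will like". Any proposed method should at least beat this baseline on held-out ratings.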

References:

The Airbnb New User Bookings dataset
The MovieLens dataset

Supervisor: Kevin Smith

-------------------------------------------------------------------------------

Title: Credibility analysis in Twitter feeds using domain knowledge

Theme: Algorithms

Subject: Society, particularly the younger generation, has become increasingly reliant on internet-based sources such as Twitter or Facebook to receive news and updates. The information passed on is often unvetted and comes from non-traditional news sources, frequently individuals. Traditional journalists verify information and sources before publishing, but online platforms often allow false information and rumors to spread. Numerous examples can be cited following recent disasters. One possible way to cope with this problem is to develop automatic methods that assess the credibility of tweets by analyzing the language and the source identity. One potential bottleneck is the availability of annotated Twitter data to train and evaluate credibility models.

Possible essay project:

Develop a method to predict the trustworthiness of sources and the credibility of their tweets. Familiarize yourself with existing research on the topic, and consider methods from natural language processing.
Think about alternative methods to collect the data necessary to build an automatic credibility assessment algorithm.

References:

The Apollo Project, Twitter datasets
Defining ground truth in information credibility on Twitter
Tweets2011, a corpus of tweet identifiers

Supervisor: Kevin Smith

-------------------------------------------------------------------------------

Title: Character recognition in natural images

Theme: Algorithms

Subject: Character recognition is a classic pattern recognition problem. For the Latin script, character recognition is largely a solved problem given certain constraints (e.g. images of a scanned document). However, character recognition in natural images (photographs) poses a much harder problem, since characters can be much more difficult to recognize (e.g. characters in neon script outside a restaurant).

Possible essay project:

Develop a method to recognize characters found in natural images

References:

The Chars74K dataset
A paper on character recognition in natural images
The Julia language
TensorFlow convolutional neural networks

Supervisor: Kevin Smith