Degree Project Proposals
Examiner: Mats Brorsson, professor
Computer Architecture
Introduction
The projects described on this page are suggestions for suitable degree projects at master level. They are mainly targeted at students in the IT, D and E programmes (Civilingenjör) and students in the Embedded Systems, System-on-Chip Design or Software Engineering of Distributed Systems master programmes who are interested in computer systems, computer architecture or parallel computer systems. For each project, I list the background needed to carry it out and whether the project could be sponsored financially or carried out in industry.
The projects are listed chronologically with the newest first. Projects that are not sponsored financially are handed out to suitable students on a first-come, first-served basis. Sponsored projects, if any, may have an application procedure.
Most projects are carried out within the scope of the Multicore Center at SICS. All thesis projects carried out here have the potential to generate research publications and are therefore an excellent start if you ever want to pursue a research career.
Project proposals
Proposals with a link have a more detailed description below. Students interested in the other proposals are welcome to discuss them directly with Prof. Mats Brorsson. Some proposals are just sketches and need to be defined more precisely; others are already well defined.
Getting Barrelfish to run on the Tilera TilePRO64 architecture
The TilePRO64 processor is a 64-core embedded processor targeting media applications (www.tilera.com). Barrelfish is a new research operating system that tries to meet the challenges posed by the heterogeneous and manycore processors that are starting to appear. The project would continue an existing effort to get Barrelfish running on the Tilera hardware.
The project tentatively includes the following tasks:
- Finishing the porting effort of Barrelfish
- Measuring and characterizing Barrelfish performance on simulator and hardware
Suitable for 1-2 (preferably 2) students interested in system software and parallel architectures.
Simulation environment for Barrelfish on Simics
Barrelfish is a new research operating system that tries to meet the challenges posed by the heterogeneous and manycore processors that are starting to appear. It boots relatively cleanly on standard PC hardware. It would be very valuable to be able to boot it on a full-system simulator such as Simics (www.simics.net), as this would give much better observability and make it easy to test new hardware devices.
The project tentatively includes the following tasks:
- Developing new device drivers (or new devices for which there is a working device driver) for devices not yet supported in Barrelfish
- Developing Barrelfish shell tools (ps, top, time) useful for observing and controlling behaviour
- A performance study on a range of simulated architectures
Porting Cilk Plus to Barrelfish
Intel Cilk Plus is a state-of-the-art programming model for manycore architectures. It currently runs mostly on standard operating systems such as Linux and Windows; this project involves porting it to the Barrelfish operating system so as to provide a programming model at a higher level of abstraction.
The project tentatively includes the following tasks:
Workload characterization of task-centric applications on manycore architectures
Task-centric parallel programming is rapidly gaining interest over more traditional thread-centric models. The Barcelona OpenMP Task Suite (BOTS) is a collection of programs often used to study implementations of OpenMP and other task-centric models. This project aims to do an architectural characterization of these benchmarks on two manycore architectures (a 48-core AMD x86-64 system and the TilePRO64 architecture from Tilera).
Bare-metal task-centric run-time system on Tilera
The TilePRO64 architecture normally runs Linux. An OS inevitably creates some overhead, and it is possible to run bare-metal software on the TilePRO64 architecture without an operating system. This project aims to develop and evaluate a bare-metal run-time system for executing task-centric programs on a partition of the TilePRO64 architecture.
Hybrid static and dynamic scheduling of task-centric workloads
Description: Current task schedulers use work-stealing as a scheduling technique to balance and distribute work across compute resources. With the addition of pipelined task-parallelism, many applications now have many static features. For example, the main loop always spawns X tasks on each iteration, and the tasks have the same dependencies among them every time. Looking at these static DAGs, can we schedule smarter by combining static scheduling and dynamic scheduling?
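As a rough illustration of the question above, the following Python sketch precomputes a static schedule for a DAG that repeats on every iteration and keeps a stealing step as a dynamic fallback. Everything in it is an assumption made for illustration: the task names and costs, the greedy list-scheduling heuristic in `static_schedule`, and the victim selection in `steal` (real work-stealing run-times usually pick a random victim). It is not taken from any existing run-time system.

```python
from collections import deque

# Hypothetical per-iteration DAG: task -> (cost, list of predecessors).
DAG = {
    "load":  (2, []),
    "fft_a": (4, ["load"]),
    "fft_b": (4, ["load"]),
    "merge": (1, ["fft_a", "fft_b"]),
}

def topological_order(dag):
    """Return the tasks in an order that respects all dependencies (Kahn's algorithm)."""
    indegree = {task: len(preds) for task, (_, preds) in dag.items()}
    ready = deque(task for task, degree in indegree.items() if degree == 0)
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for succ, (_, preds) in dag.items():
            if task in preds:
                indegree[succ] -= 1
                if indegree[succ] == 0:
                    ready.append(succ)
    return order

def static_schedule(dag, num_workers):
    """Greedy list scheduling: place each task on the worker that can start it earliest."""
    free_at = [0] * num_workers      # time at which each worker becomes idle
    finish = {}                      # task -> finish time
    placement = {}                   # task -> worker index
    for task in topological_order(dag):
        cost, preds = dag[task]
        ready_time = max((finish[p] for p in preds), default=0)
        worker = min(range(num_workers), key=lambda w: max(free_at[w], ready_time))
        start = max(free_at[worker], ready_time)
        finish[task] = start + cost
        free_at[worker] = finish[task]
        placement[task] = worker
    return placement, max(finish.values())

def steal(queues, thief):
    """Dynamic fallback: an idle worker takes the oldest task from the longest queue."""
    victim = max(range(len(queues)), key=lambda w: len(queues[w]))
    if victim == thief or not queues[victim]:
        return None
    return queues[victim].popleft()

# The schedule is computed once and can be reused on every iteration of the main loop.
placement, makespan = static_schedule(DAG, num_workers=2)
print(placement, makespan)
```

With two workers the two `fft` tasks end up on different workers, so the iteration takes 7 time units instead of the 11 a single worker would need. A hybrid scheduler along these lines would reuse `placement` while the DAG stays the same and call something like `steal` only when the measured costs drift from the assumed ones.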
User Controllable Caches - A simulation platform with controllable caches
Description: Current architectures offer limited support for controlling the placement of data. There is also little support for querying the cache for statistics and knowledge about data. As we move toward a future where data placement becomes much more expensive than the computation at hand, what support do we require from the underlying hardware? This work would implement such a controllable cache-based system using an existing simulation framework such as gem5 or Simics.
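To make the idea of a queryable and controllable cache a little more concrete, here is a minimal Python sketch of a toy cache model. The class `ControllableCache` and its `pin`, `contains` and `stats` operations are hypothetical names invented for this illustration; they are not part of gem5 or Simics, and a real project would have to decide which operations the hardware should actually expose.

```python
from collections import OrderedDict

class ControllableCache:
    """Toy fully associative LRU cache with software-visible statistics and pinning."""

    def __init__(self, num_lines, line_size=64):
        self.num_lines = num_lines
        self.line_size = line_size
        self.lines = OrderedDict()   # line number -> pinned flag, kept in LRU order
        self.hits = 0
        self.misses = 0

    def _line(self, addr):
        return addr // self.line_size

    def _insert(self, line, pinned):
        """Insert a line, evicting the least recently used unpinned line if needed."""
        if len(self.lines) >= self.num_lines:
            victim = next((l for l, p in self.lines.items() if not p), None)
            if victim is None:
                return False         # every line is pinned: the access bypasses the cache
            del self.lines[victim]
        self.lines[line] = pinned
        return True

    def access(self, addr):
        """Simulate a load or store; return True on a hit."""
        line = self._line(addr)
        if line in self.lines:
            self.hits += 1
            self.lines.move_to_end(line)   # mark as most recently used
            return True
        self.misses += 1
        self._insert(line, pinned=False)
        return False

    def pin(self, addr):
        """Placement control: bring a line in and protect it from eviction.

        Returns False when the request cannot be honoured because all lines are pinned.
        """
        line = self._line(addr)
        if line in self.lines:
            self.lines[line] = True
            return True
        return self._insert(line, pinned=True)

    def contains(self, addr):
        """Query: is the line holding addr currently cached?"""
        return self._line(addr) in self.lines

    def stats(self):
        """Query: hit and miss counters plus the number of pinned lines."""
        return {"hits": self.hits, "misses": self.misses,
                "pinned": sum(self.lines.values())}

cache = ControllableCache(num_lines=2)
cache.pin(0)          # keep the line at address 0 resident
cache.access(64)      # miss, fills the second line
cache.access(128)     # miss, evicts the unpinned line holding address 64
cache.access(0)       # hit on the pinned line
print(cache.contains(0), cache.contains(64), cache.stats())
```

The sketch models only residency, not timing or power. Its purpose is to show the kind of interface (query plus placement hint) whose hardware cost and benefit the project would evaluate.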
Un-Coherent Cache on the Tilera - A do or don't approach in terms of performance and power
Description: The TilePRO64 processor is a 64-core embedded processor targeting media applications. Among other things, it features a high-speed interconnect and customizable caches. One of the options for controlling the caches is the ability to use non-coherent memory regions. But how are these non-coherent regions used? What limitations do they impose, and do the benefits of using them outweigh the costs? This project would include porting a run-time system (Nanos++, MIR, UnRT) and evaluating different workloads with and without coherence in terms of both performance and power.
A parallel application survey
Description: There is always a need for benchmarks. We currently do not have enough benchmarks to work with, and the few we have do not properly represent real-world applications. This project would involve a broad survey of existing applications that can be parallelized. The programming models in focus would be task-centric models and GPUs. Additionally, this project would involve contacting various institutes and departments within KTH to find out what problems they have and whether those problems can be parallelized.
Evaluation of power, performance and scalability of cache coherence mechanisms for an adaptive multicore architecture
Putting multiple processors on one chip (multicore) is one way to handle the design complexity and diminishing returns of every new generation of computer architectures. Each core in a multicore processor typically has private instruction and data caches. The data caches of the different processors are kept coherent by means of a cache coherence protocol. Proposed multicore architectures have implemented this coherence in different ways, and it is not clear which one is preferable when performance, power consumption and scalability have to be taken into account.
This project aims to compare and evaluate at least two different ways of achieving coherence in a multicore architecture. Some example coherence methods are:
- Bus-based snooping
- Centralized snooping engine like the one used in the ARM MPCore
- Decentralized directory-based coherence (a simulation model exists)
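For readers new to the first option in the list above, the sketch below models bus-based snooping for a single cache line in Python. It assumes the textbook three-state MSI protocol (Modified, Shared, Invalid), which is chosen here only for illustration; the proposal does not prescribe a particular protocol, and the class `SnoopingBus` and its counters are hypothetical.

```python
class SnoopingBus:
    """Toy bus-based MSI protocol for one cache line shared by several cores."""

    def __init__(self, num_cores):
        self.state = ["I"] * num_cores   # per-core line state: "M", "S" or "I"
        self.bus_transactions = 0        # broadcasts seen by every cache
        self.invalidations = 0           # copies invalidated in other caches

    def read(self, core):
        if self.state[core] in ("M", "S"):
            return                       # read hit, no bus traffic
        self.bus_transactions += 1       # read miss: broadcast a bus read
        for other, st in enumerate(self.state):
            if other != core and st == "M":
                self.state[other] = "S"  # owner supplies the data and downgrades
        self.state[core] = "S"

    def write(self, core):
        if self.state[core] == "M":
            return                       # write hit on an exclusive copy
        self.bus_transactions += 1       # broadcast a read-exclusive or upgrade
        for other, st in enumerate(self.state):
            if other != core and st != "I":
                self.state[other] = "I"  # every other copy is invalidated
                self.invalidations += 1
        self.state[core] = "M"

bus = SnoopingBus(num_cores=4)
bus.read(0)    # core 0 fetches the line
bus.read(1)    # core 1 shares it
bus.write(0)   # core 0 upgrades and invalidates core 1's copy
bus.read(1)    # core 1 misses again; core 0 downgrades to shared
bus.write(1)   # core 1 upgrades and invalidates core 0's copy
bus.write(1)   # hit, no bus traffic
print(bus.state, bus.bus_transactions, bus.invalidations)
```

Every miss or upgrade is broadcast to all caches, which is why the number of bus transactions is a natural first metric when comparing snooping against a directory scheme in terms of scalability and power.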
The project tentatively includes the following tasks:
- A literature study on chip-multiprocessor architectures and suitable cache coherence mechanisms
- A design and implementation of two (or three) different cache coherence mechanisms (including the interconnection medium) for a simulated CMP
- An evaluation with respect to performance, power and scalability for the two (three) coherence mechanisms using a simulated chip-multiprocessor model
A suitable methodology is to use the Simics/GEMS full-system infrastructure. Performance, power and area models are preferably taken from the Cacti (http://research.compaq.com/wrl/people/jouppi/CACTI.html) models.
The project is suitable for one or two students interested in and with knowledge of parallel computer architecture. Suitable courses at KTH are IS2202 (Computer systems architecture) and preferably also IS2200 (Parallel computer systems).
The thesis project can start any time.
The standard way of performing performance and power evaluations in computer architecture is to use high-level simulation, where you run workload applications on a simulated architecture. State-of-the-art simulators include SimpleScalar and Simics. These simulators are written in conventional programming languages and are thus extremely flexible in what you can simulate, although Simics is a commercial product, which limits what is exposed to the user. It is, however, quite cumbersome to develop new simulation models.
The Liberty Simulation Environment (LSE) is a new simulation infrastructure for architectural research that is built on the concept of component reuse and uses a concurrent-structural methodology to describe hardware components. The topic of this MSc thesis project is to develop or adapt simulators of a specific architecture using SimpleScalar, Simics and LSE in order to compare these models in terms of expressiveness, difficulty of use, performance and modeling capabilities.
Since this project's first incarnation, new simulation models have appeared. A survey of the state of the art in simulation technology is part of the project.
The project tentatively involves the following tasks:
- Selection and definition of a target architecture to simulate
- Adaptation of Simics and SimpleScalar to simulate the selected target architecture
- Development of an LSE-based simulator of the target architecture
- Literature study in the field of architecture simulation
- An evaluation of the three simulation environments with respect to expressiveness, ease of use, performance, etc.
This thesis project is suitable for one or two students who are interested in computer architecture and performance evaluation. Knowledge corresponding to IS2202 (Computer systems architecture) is required.
The thesis project can start any time.
Your own proposal here...
- Suitable for one or two students with a background in computer architecture and/or parallel computer systems
- This project is not sponsored
- Proposed starting date: Any time
You are welcome to define your own project in consultation with me or a suitable advisor.
Mats Brorsson, 19 Jan 2012