Timeline and Temporal Analysis¶
Temporal analysis of parliamentary data reveals patterns, trends, and insights that span decades of Danish legislative history. With 74+ years of parliamentary data available through the OData API, you can conduct sophisticated chronological analyses of legislative processes, voting behavior, and political dynamics.
Overview¶
The Danish Parliamentary OData API provides exceptional temporal depth, with structured period data from 1952 to 2026 and active real-time updates. This enables comprehensive timeline analysis across multiple dimensions:
- Legislative Process Evolution: Track how bills progress through parliament over time
- Voting Pattern Analysis: Identify temporal trends in political party alignment
- Actor Career Trajectories: Follow politicians' voting records and activity patterns
- Seasonal Parliamentary Rhythms: Discover periodic patterns in legislative activity
- Historical Trend Analysis: Compare political dynamics across decades
- Event Impact Assessment: Analyze how external events affect parliamentary behavior
Timeline Analysis Opportunities¶
1. Data Temporal Structure¶
The API organizes temporal data through several key timestamp fields:
# Key temporal fields across entities
temporal_fields = {
'opdateringsdato': 'Last system update (all entities)',
'dato': 'Event/meeting date (Møde, Dokument)',
'afgørelsesdato': 'Decision date (Sag)',
'startdato/slutdato': 'Period boundaries (Periode, Aktør)',
'lovnummerdato': 'Law publication date (Sag)',
'rådsmødedato': 'Council meeting date (specialized meetings)'
}
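All of these fields accept OData v3 datetime literals in $filter expressions. As a minimal sketch (the entity and field here are illustrative; substitute whichever field from the table fits your question):
# Sketch: the five most recently decided cases, filtered and ordered
# by afgørelsesdato using the OData v3 datetime'...' literal syntax
import requests

params = {
    '$filter': "afgørelsesdato gt datetime'2024-01-01'",
    '$orderby': 'afgørelsesdato desc',
    '$select': 'id,titel,afgørelsesdato',
    '$top': 5
}
response = requests.get("https://oda.ft.dk/api/Sag", params=params)
response.raise_for_status()
for sag in response.json()['value']:
    print(sag['afgørelsesdato'], sag['titel'][:60])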
2. Historical Coverage¶
Period Structure (1952-2026):

- 165+ distinct parliamentary periods with precise boundaries
- Annual parliamentary sessions with standardized start/end dates
- Forward-planning capability, with future periods pre-defined
Data Availability Patterns:

- Complete period metadata back to 1952
- Content data varies by entity type and historical significance
- System timestamps reflect API deployment (circa 2014), not original document dates
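You can verify this coverage directly against the Periode entity. A minimal sketch, assuming all periods fit in a single $top=200 response (if the service caps page sizes, page through with $skip instead):
import requests
import pandas as pd

def summarize_period_coverage():
    """Fetch all parliamentary periods and report their temporal span."""
    response = requests.get("https://oda.ft.dk/api/Periode", params={'$top': 200})
    response.raise_for_status()
    periods = pd.DataFrame(response.json()['value'])
    periods['startdato'] = pd.to_datetime(periods['startdato'])
    periods['slutdato'] = pd.to_datetime(periods['slutdato'])
    print(f"{len(periods)} periods spanning "
          f"{periods['startdato'].min().year}-{periods['slutdato'].max().year}")
    return periods

periods = summarize_period_coverage()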
3. Real-Time Update Monitoring¶
The API provides near real-time updates during active parliamentary sessions:
import requests
from datetime import datetime, timedelta
import pandas as pd
def monitor_recent_activity(hours_back=24):
"""Monitor parliamentary activity in the last N hours"""
cutoff = datetime.now() - timedelta(hours=hours_back)
cutoff_str = cutoff.strftime('%Y-%m-%dT%H:%M:%S')
# Check recent updates across key entities
entities = ['Sag', 'Afstemning', 'Møde', 'Dokument']
activity_data = []
for entity in entities:
url = f"https://oda.ft.dk/api/{entity}"
params = {
'$filter': f"opdateringsdato gt datetime'{cutoff_str}'",
'$orderby': 'opdateringsdato desc',
'$top': 50,
'$select': 'id,titel,opdateringsdato'
}
response = requests.get(url, params=params)
if response.status_code == 200:
data = response.json()
for item in data['value']:
activity_data.append({
'entity': entity,
'id': item['id'],
'title': (item.get('titel') or 'N/A')[:100],  # guard against null titles
'updated': item['opdateringsdato']
})
return pd.DataFrame(activity_data)
# Usage
recent_activity = monitor_recent_activity(hours_back=8)
print(f"Found {len(recent_activity)} recent updates")
recent_activity.head()
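The examples throughout this page cap result sets with $top to keep them fast. For exhaustive analyses you will usually need to page through results instead; the sketch below uses standard OData $skip/$top paging (the page size and the empty-page stopping condition are assumptions to check against the live service):
import requests

def fetch_all(entity, params=None, page_size=100):
    """Page through an entity with $skip/$top until an empty page is returned."""
    params = dict(params or {})
    params['$top'] = page_size
    records = []
    skip = 0
    while True:
        params['$skip'] = skip
        response = requests.get(f"https://oda.ft.dk/api/{entity}", params=params)
        response.raise_for_status()
        page = response.json().get('value', [])
        if not page:
            break
        records.extend(page)
        skip += page_size
    return records

# Example: every case updated since the start of 2025, not just the first page
# all_recent = fetch_all('Sag', {'$filter': "opdateringsdato gt datetime'2025-01-01'"})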
Temporal Patterns in Legislative Processes¶
1. Case Lifecycle Analysis¶
Parliamentary cases follow predictable temporal patterns that can be analyzed for process efficiency and duration trends:
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime
import numpy as np
def analyze_case_duration_patterns():
"""Analyze how long cases take to complete by category"""
# Fetch completed cases with decision dates
url = "https://oda.ft.dk/api/Sag"
params = {
'$filter': "afgørelsesdato ne null and afgørelsesdato gt datetime'2020-01-01'",
'$expand': 'Sagskategori',
'$select': 'id,titel,afgørelsesdato,opdateringsdato,kategoriid,periodeid',
'$top': 1000
}
response = requests.get(url, params=params)
data = response.json()
case_durations = []
for case in data['value']:
if case.get('afgørelsesdato'):
# Calculate duration from creation to decision
decision_date = datetime.fromisoformat(case['afgørelsesdato'].replace('Z', '+00:00'))
update_date = datetime.fromisoformat(case['opdateringsdato'].replace('Z', '+00:00'))
# Note: Using opdateringsdato as proxy for creation (limitation)
duration_days = (decision_date - update_date).days
case_durations.append({
'case_id': case['id'],
'title': case['titel'][:50],
'duration_days': abs(duration_days), # Handle negative durations
'category_id': case.get('kategoriid'),
'period_id': case['periodeid'],
'decision_date': decision_date
})
df = pd.DataFrame(case_durations)
# Visualize duration patterns
plt.figure(figsize=(15, 10))
# Duration distribution
plt.subplot(2, 2, 1)
plt.hist(df['duration_days'], bins=50, alpha=0.7, edgecolor='black')
plt.title('Case Duration Distribution')
plt.xlabel('Duration (days)')
plt.ylabel('Frequency')
# Duration by category
plt.subplot(2, 2, 2)
category_durations = df.groupby('category_id')['duration_days'].mean().sort_values(ascending=True)
category_durations.plot(kind='barh')
plt.title('Average Duration by Category')
plt.xlabel('Average Duration (days)')
# Temporal trend
plt.subplot(2, 2, 3)
df['decision_month'] = df['decision_date'].dt.to_period('M')
monthly_avg = df.groupby('decision_month')['duration_days'].mean()
monthly_avg.plot()
plt.title('Duration Trends Over Time')
plt.ylabel('Average Duration (days)')
plt.xticks(rotation=45)
# Duration vs complexity (title length as proxy)
plt.subplot(2, 2, 4)
df['title_length'] = df['title'].str.len()
plt.scatter(df['title_length'], df['duration_days'], alpha=0.6)
plt.xlabel('Title Length (characters)')
plt.ylabel('Duration (days)')
plt.title('Complexity vs Duration')
plt.tight_layout()
plt.show()
return df
# Analyze case durations
duration_analysis = analyze_case_duration_patterns()
2. Legislative Process Flow Timing¶
Track how cases progress through different stages of the parliamentary process:
def analyze_legislative_flow_timing():
"""Analyze timing patterns in legislative flow"""
# Get cases with their steps (sagsforløb)
url = "https://oda.ft.dk/api/Sag"
params = {
'$expand': 'Sagstrin',
'$filter': "periodeid gt 140", # Recent periods
'$select': 'id,titel,periodeid,afgørelsesdato',
'$top': 500
}
response = requests.get(url, params=params)
data = response.json()
flow_data = []
for case in data['value']:
if 'Sagstrin' in case and case['Sagstrin']:
steps = case['Sagstrin']
for i, step in enumerate(steps):
flow_data.append({
'case_id': case['id'],
'step_number': i + 1,
'step_id': step['id'],
'step_date': step.get('dato'),
'step_title': step.get('titel', ''),
'total_steps': len(steps),
'case_title': case['titel'][:50]
})
df = pd.DataFrame(flow_data)
# Analyze step timing patterns
step_analysis = df.groupby('step_number').agg({
'case_id': 'count',
'step_title': lambda x: x.mode().iloc[0] if not x.empty else 'Unknown'
}).rename(columns={'case_id': 'frequency'})
print("Most Common Legislative Steps by Position:")
print(step_analysis.head(10))
# Visualize step frequency
plt.figure(figsize=(12, 6))
step_analysis['frequency'].plot(kind='bar')
plt.title('Frequency of Legislative Steps by Position')
plt.xlabel('Step Number in Process')
plt.ylabel('Frequency')
plt.xticks(rotation=0)
plt.show()
return df
# Analyze legislative flow
flow_timing = analyze_legislative_flow_timing()
Voting Timeline Analysis and Trend Detection¶
1. Party Voting Alignment Over Time¶
Analyze how party voting patterns change over time and identify periods of increased or decreased political alignment:
def analyze_party_voting_evolution():
"""Analyze evolution of party voting patterns over time"""
# Get voting data with party information
url = "https://oda.ft.dk/api/Afstemning"
params = {
'$expand': 'Stemme/Aktør',
'$filter': "year(opdateringsdato) ge 2020",
'$orderby': 'opdateringsdato desc',
'$top': 200
}
response = requests.get(url, params=params)
data = response.json()
voting_patterns = []
for vote_session in data['value']:
if 'Stemme' in vote_session and vote_session['Stemme']:
session_date = vote_session.get('opdateringsdato')
for vote in vote_session['Stemme']:
if 'Aktør' in vote and vote['Aktør']:
actor = vote['Aktør']
voting_patterns.append({
'session_id': vote_session['id'],
'session_date': session_date,
'actor_id': actor['id'],
'actor_name': actor.get('navn', ''),
'party_group': actor.get('gruppenavnkort', 'Unknown'),
'vote_type': vote.get('typeid'),
'vote_result': vote_session.get('vedtaget', None)
})
df = pd.DataFrame(voting_patterns)
# Analyze party alignment trends
if not df.empty:
df['session_month'] = pd.to_datetime(df['session_date']).dt.to_period('M')
# Calculate party agreement rates by month
monthly_party_votes = df.groupby(['session_month', 'party_group', 'vote_result']).size().reset_index(name='vote_count')
# Visualize party voting trends
plt.figure(figsize=(15, 8))
for party in df['party_group'].value_counts().head(5).index:
party_data = monthly_party_votes[monthly_party_votes['party_group'] == party]
if not party_data.empty:
party_monthly = party_data.groupby('session_month')['vote_count'].sum()
party_monthly.plot(label=party, marker='o')
plt.title('Party Voting Activity Over Time')
plt.xlabel('Month')
plt.ylabel('Number of Votes')
plt.legend()
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
return df
# Analyze party voting evolution
party_evolution = analyze_party_voting_evolution()
2. Individual Politician Voting Patterns¶
Track individual politicians' voting behavior and career trajectories:
def analyze_politician_voting_timeline(actor_id=None, party_group=None):
"""Analyze individual politician voting patterns over time"""
if actor_id:
filter_clause = f"Stemme/any(s: s/aktørid eq {actor_id})"
elif party_group:
filter_clause = f"Stemme/any(s: s/Aktør/gruppenavnkort eq '{party_group}')"
else:
filter_clause = "year(opdateringsdato) ge 2020"
url = "https://oda.ft.dk/api/Afstemning"
params = {
'$expand': 'Stemme/Aktør,Sag',
'$filter': filter_clause,
'$orderby': 'opdateringsdato asc',
'$top': 500
}
response = requests.get(url, params=params)
data = response.json()
politician_votes = []
for vote_session in data['value']:
session_info = {
'session_id': vote_session['id'],
'session_date': vote_session.get('opdateringsdato'),
'case_title': vote_session.get('Sag', {}).get('titel', 'Unknown') if 'Sag' in vote_session else 'Unknown',
'session_result': vote_session.get('vedtaget')
}
if 'Stemme' in vote_session:
for vote in vote_session['Stemme']:
if 'Aktør' in vote and vote['Aktør']:
actor = vote['Aktør']
politician_votes.append({
**session_info,
'actor_id': actor['id'],
'actor_name': actor.get('navn', ''),
'party_group': actor.get('gruppenavnkort', ''),
'vote_type': vote.get('typeid'),
'individual_vote': vote.get('type', 'Unknown')
})
df = pd.DataFrame(politician_votes)
if not df.empty:
# Create voting timeline visualization
df['session_date'] = pd.to_datetime(df['session_date'])
# Individual politician timeline
if actor_id and len(df) > 0:
politician_name = df['actor_name'].iloc[0]
plt.figure(figsize=(15, 8))
# Plot voting activity over time
monthly_activity = df.set_index('session_date').resample('M').size()
monthly_activity.plot(kind='line', marker='o')
plt.title(f'Voting Activity Timeline - {politician_name}')
plt.ylabel('Number of Votes')
plt.xlabel('Date')
plt.xticks(rotation=45)
plt.show()
# Party comparison timeline
elif party_group:
plt.figure(figsize=(15, 10))
# Vote type distribution over time
vote_timeline = df.groupby([df['session_date'].dt.to_period('M'), 'individual_vote']).size().unstack(fill_value=0)
vote_timeline.plot(kind='area', stacked=True, alpha=0.7)
plt.title(f'Vote Type Distribution Over Time - {party_group}')
plt.ylabel('Number of Votes')
plt.xlabel('Month')
plt.legend(title='Vote Type', bbox_to_anchor=(1.05, 1), loc='upper left')
plt.tight_layout()
plt.show()
return df
# Example usage
# Individual politician analysis
politician_timeline = analyze_politician_voting_timeline(actor_id=5) # Replace with actual actor ID
# Party group analysis
party_timeline = analyze_politician_voting_timeline(party_group="V") # Venstre party
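The actor_id=5 above is a placeholder. To find a real ID, search the Aktør entity by name; this sketch uses OData v3's substringof() string function (an assumption about the service's supported filter functions; fall back to an exact navn eq '...' match if it is rejected):
import requests

def find_actor_ids(name_fragment):
    """List Aktør records whose navn contains the given fragment."""
    params = {
        '$filter': f"substringof('{name_fragment}', navn)",  # assumed OData v3 function
        '$select': 'id,navn,gruppenavnkort',
        '$top': 10
    }
    response = requests.get("https://oda.ft.dk/api/Aktør", params=params)
    response.raise_for_status()
    return [(a['id'], a['navn'], a.get('gruppenavnkort'))
            for a in response.json().get('value', [])]

print(find_actor_ids('Frederiksen'))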
Actor Activity Patterns Over Time¶
1. Career Progression Analysis¶
Analyze how politicians' roles and activity levels change throughout their careers:
def analyze_career_progression():
"""Analyze politician career progression and activity patterns"""
url = "https://oda.ft.dk/api/Aktør"
params = {
'$filter': "startdato ne null and slutdato ne null",
'$select': 'id,navn,gruppenavnkort,startdato,slutdato,typeid,periodeid,biografi',
'$top': 500
}
response = requests.get(url, params=params)
data = response.json()
career_data = []
for actor in data['value']:
if actor.get('startdato') and actor.get('slutdato'):
start_date = datetime.fromisoformat(actor['startdato'].replace('Z', '+00:00'))
end_date = datetime.fromisoformat(actor['slutdato'].replace('Z', '+00:00'))
tenure_days = (end_date - start_date).days
career_data.append({
'actor_id': actor['id'],
'name': actor.get('navn', ''),
'party': actor.get('gruppenavnkort', ''),
'start_date': start_date,
'end_date': end_date,
'tenure_days': tenure_days,
'tenure_years': tenure_days / 365.25,
'actor_type': actor.get('typeid'),
'period_id': actor.get('periodeid'),
'has_biography': len(actor.get('biografi') or '') > 50  # biografi may be null
})
df = pd.DataFrame(career_data)
# Career progression analysis
plt.figure(figsize=(15, 12))
# Tenure distribution
plt.subplot(2, 2, 1)
plt.hist(df['tenure_years'], bins=30, alpha=0.7, edgecolor='black')
plt.title('Career Tenure Distribution')
plt.xlabel('Tenure (years)')
plt.ylabel('Frequency')
# Tenure by party
plt.subplot(2, 2, 2)
party_tenure = df.groupby('party')['tenure_years'].mean().sort_values(ascending=True)
party_tenure.tail(10).plot(kind='barh')
plt.title('Average Tenure by Party (Top 10)')
plt.xlabel('Average Tenure (years)')
# Career start timeline
plt.subplot(2, 2, 3)
df['start_year'] = df['start_date'].dt.year
yearly_starts = df.groupby('start_year').size()
yearly_starts.plot()
plt.title('New Political Careers Started by Year')
plt.xlabel('Year')
plt.ylabel('Number of New Politicians')
# Active periods heatmap
plt.subplot(2, 2, 4)
df['start_decade'] = (df['start_year'] // 10) * 10
df['end_decade'] = (df['end_date'].dt.year // 10) * 10
decade_activity = df.groupby(['start_decade', 'end_decade']).size().unstack(fill_value=0)
sns.heatmap(decade_activity, annot=True, fmt='d', cmap='YlOrRd')
plt.title('Career Spans by Decade')
plt.xlabel('End Decade')
plt.ylabel('Start Decade')
plt.tight_layout()
plt.show()
return df
# Analyze career progression
career_progression = analyze_career_progression()
2. Activity Level Measurement¶
Measure and visualize politician activity levels across different dimensions:
def measure_actor_activity_levels():
"""Measure politician activity across different parliamentary functions"""
# Get voting activity
voting_url = "https://oda.ft.dk/api/Stemme"
voting_params = {
'$expand': 'Aktør',
'$filter': "year(opdateringsdato) ge 2020",
'$select': 'aktørid,opdateringsdato',
'$top': 2000
}
voting_response = requests.get(voting_url, params=voting_params)
voting_data = voting_response.json()
# Get case involvement
case_url = "https://oda.ft.dk/api/SagAktør"
case_params = {
'$expand': 'Aktør',
'$filter': "year(opdateringsdato) ge 2020",
'$select': 'aktørid,rolleid,opdateringsdato',
'$top': 2000
}
case_response = requests.get(case_url, params=case_params)
case_data = case_response.json()
# Calculate activity metrics
activity_metrics = {}
# Voting activity
for vote in voting_data.get('value', []):
actor_id = vote.get('aktørid')
if actor_id:
if actor_id not in activity_metrics:
activity_metrics[actor_id] = {
'voting_count': 0,
'case_involvement': 0,
'actor_name': 'Unknown'
}
activity_metrics[actor_id]['voting_count'] += 1
# Case involvement
for involvement in case_data.get('value', []):
actor_id = involvement.get('aktørid')
if actor_id:
if actor_id not in activity_metrics:
activity_metrics[actor_id] = {
'voting_count': 0,
'case_involvement': 0,
'actor_name': 'Unknown'
}
activity_metrics[actor_id]['case_involvement'] += 1
# Get actor name if available
if 'Aktør' in involvement and involvement['Aktør']:
activity_metrics[actor_id]['actor_name'] = involvement['Aktør'].get('navn', 'Unknown')
# Convert to DataFrame for analysis
activity_df = pd.DataFrame.from_dict(activity_metrics, orient='index')
activity_df.index.name = 'actor_id'
activity_df.reset_index(inplace=True)
# Calculate composite activity score
activity_df['total_activity'] = activity_df['voting_count'] + activity_df['case_involvement']
# Visualize activity patterns
plt.figure(figsize=(15, 10))
# Activity distribution
plt.subplot(2, 2, 1)
plt.hist(activity_df['total_activity'], bins=30, alpha=0.7, edgecolor='black')
plt.title('Total Activity Distribution')
plt.xlabel('Total Activity Score')
plt.ylabel('Number of Politicians')
# Voting vs case involvement
plt.subplot(2, 2, 2)
plt.scatter(activity_df['voting_count'], activity_df['case_involvement'], alpha=0.6)
plt.xlabel('Voting Activity')
plt.ylabel('Case Involvement')
plt.title('Voting Activity vs Case Involvement')
# Top active politicians
plt.subplot(2, 2, 3)
top_active = activity_df.nlargest(15, 'total_activity')
plt.barh(range(len(top_active)), top_active['total_activity'])
plt.yticks(range(len(top_active)), [name[:20] + '...' if len(name) > 20 else name
for name in top_active['actor_name']])
plt.title('Most Active Politicians')
plt.xlabel('Total Activity Score')
# Activity correlation
plt.subplot(2, 2, 4)
correlation = activity_df['voting_count'].corr(activity_df['case_involvement'])
plt.text(0.5, 0.5, f'Voting-Case Correlation:\n{correlation:.3f}',
ha='center', va='center', fontsize=16,
bbox=dict(boxstyle='round', facecolor='lightblue', alpha=0.8))
plt.xlim(0, 1)
plt.ylim(0, 1)
plt.axis('off')
plt.title('Activity Correlation')
plt.tight_layout()
plt.show()
return activity_df
# Measure activity levels
activity_analysis = measure_actor_activity_levels()
Seasonal and Periodic Patterns¶
1. Parliamentary Session Rhythms¶
Identify seasonal patterns in parliamentary activity and legislation timing:
def analyze_seasonal_patterns():
"""Analyze seasonal and periodic patterns in parliamentary activity"""
# Get meeting data with dates
url = "https://oda.ft.dk/api/Møde"
params = {
'$filter': "dato ge datetime'2018-01-01'",
'$select': 'id,titel,dato,statusid,nummer',
'$orderby': 'dato asc',
'$top': 2000
}
response = requests.get(url, params=params)
data = response.json()
meeting_data = []
for meeting in data['value']:
if meeting.get('dato'):
meeting_date = datetime.fromisoformat(meeting['dato'].replace('Z', '+00:00'))
meeting_data.append({
'meeting_id': meeting['id'],
'title': meeting.get('titel', ''),
'date': meeting_date,
'year': meeting_date.year,
'month': meeting_date.month,
'weekday': meeting_date.weekday(), # 0=Monday
'quarter': (meeting_date.month - 1) // 3 + 1,
'day_of_year': meeting_date.timetuple().tm_yday,
'status_id': meeting.get('statusid')
})
df = pd.DataFrame(meeting_data)
# Seasonal analysis visualization
plt.figure(figsize=(20, 15))
# Monthly distribution
plt.subplot(3, 3, 1)
monthly_meetings = df.groupby('month').size()
monthly_meetings.plot(kind='bar')
plt.title('Parliamentary Meetings by Month')
plt.xlabel('Month')
plt.ylabel('Number of Meetings')
plt.xticks(range(12), ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'], rotation=45)
# Quarterly pattern
plt.subplot(3, 3, 2)
quarterly_meetings = df.groupby('quarter').size()
quarterly_meetings.plot(kind='bar', color=['skyblue', 'lightgreen', 'coral', 'gold'])
plt.title('Meetings by Quarter')
plt.xlabel('Quarter')
plt.ylabel('Number of Meetings')
plt.xticks(range(4), ['Q1', 'Q2', 'Q3', 'Q4'], rotation=0)
# Weekly pattern
plt.subplot(3, 3, 3)
weekly_meetings = df.groupby('weekday').size()
weekly_meetings.plot(kind='bar', color='lightcoral')
plt.title('Meetings by Day of Week')
plt.xlabel('Day of Week')
plt.ylabel('Number of Meetings')
plt.xticks(range(7), ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'], rotation=45)
# Annual trend
plt.subplot(3, 3, 4)
yearly_meetings = df.groupby('year').size()
yearly_meetings.plot(kind='line', marker='o')
plt.title('Annual Meeting Trends')
plt.xlabel('Year')
plt.ylabel('Number of Meetings')
# Heatmap: Month vs Year
plt.subplot(3, 3, 5)
month_year_pivot = df.groupby(['year', 'month']).size().unstack(fill_value=0)
sns.heatmap(month_year_pivot.T, cmap='YlOrRd', annot=False, fmt='d')
plt.title('Activity Heatmap (Year vs Month)')
plt.ylabel('Month')
plt.xlabel('Year')
# Day of year distribution (showing seasonal patterns)
plt.subplot(3, 3, 6)
plt.hist(df['day_of_year'], bins=52, alpha=0.7, edgecolor='black')
plt.title('Activity Distribution Across Year')
plt.xlabel('Day of Year')
plt.ylabel('Frequency')
# Parliamentary recess identification
plt.subplot(3, 3, 7)
weekly_activity = df.groupby(df['date'].dt.isocalendar().week).size()
weekly_activity.plot(kind='line')
plt.title('Weekly Activity Pattern')
plt.xlabel('Week of Year')
plt.ylabel('Number of Meetings')
# Seasonal boxplot
plt.subplot(3, 3, 8)
season_map = {12: 'Winter', 1: 'Winter', 2: 'Winter',
3: 'Spring', 4: 'Spring', 5: 'Spring',
6: 'Summer', 7: 'Summer', 8: 'Summer',
9: 'Autumn', 10: 'Autumn', 11: 'Autumn'}
df['season'] = df['month'].map(season_map)
season_activity = df.groupby(['season', 'year']).size().reset_index(name='meetings')
sns.boxplot(data=season_activity, x='season', y='meetings')
plt.title('Seasonal Activity Variation')
plt.xlabel('Season')
plt.ylabel('Meetings per Year')
plt.xticks(rotation=45)
# Parliament efficiency (meetings per day of session)
plt.subplot(3, 3, 9)
# Identify non-weekend, non-holiday days (approximation)
business_days = df[df['weekday'] < 5] # Monday-Friday
monthly_efficiency = business_days.groupby('month').size() / business_days.groupby('month')['date'].nunique()
monthly_efficiency.plot(kind='bar', color='steelblue')
plt.title('Parliamentary Efficiency by Month')
plt.xlabel('Month')
plt.ylabel('Meetings per Business Day')
plt.xticks(range(12), ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'], rotation=45)
plt.tight_layout()
plt.show()
return df
# Analyze seasonal patterns
seasonal_analysis = analyze_seasonal_patterns()
2. Legislative Cycle Analysis¶
Analyze how legislative activity correlates with parliamentary periods and electoral cycles:
def analyze_legislative_cycles():
"""Analyze legislative activity patterns across parliamentary periods"""
# Get period data
periods_url = "https://oda.ft.dk/api/Periode"
periods_response = requests.get(periods_url, params={'$top': 200})
periods_data = periods_response.json()
# Get case data with periods
cases_url = "https://oda.ft.dk/api/Sag"
cases_params = {
'$filter': "periodeid ge 140", # Recent periods
'$select': 'id,titel,periodeid,afgørelsesdato,opdateringsdato,kategoriid',
'$top': 2000
}
cases_response = requests.get(cases_url, params=cases_params)
cases_data = cases_response.json()
# Process period information
periods_df = pd.DataFrame(periods_data['value'])
periods_df['startdato'] = pd.to_datetime(periods_df['startdato'])
periods_df['slutdato'] = pd.to_datetime(periods_df['slutdato'])
periods_df['duration_days'] = (periods_df['slutdato'] - periods_df['startdato']).dt.days
# Process case information
cases_df = pd.DataFrame(cases_data['value'])
cases_df['afgørelsesdato'] = pd.to_datetime(cases_df['afgørelsesdato'])
cases_df['opdateringsdato'] = pd.to_datetime(cases_df['opdateringsdato'])
# Merge with period data
legislative_activity = cases_df.merge(periods_df, left_on='periodeid', right_on='id',
how='inner', suffixes=('_case', '_period'))
# Calculate activity metrics per period
period_metrics = legislative_activity.groupby('periodeid').agg({
'id_case': 'count',
'afgørelsesdato': 'count',
'duration_days': 'first',
'startdato': 'first',
'slutdato': 'first',
'titel_period': 'first'
}).rename(columns={'id_case': 'total_cases', 'afgørelsesdato': 'decided_cases'})
period_metrics['decision_rate'] = period_metrics['decided_cases'] / period_metrics['total_cases']
period_metrics['cases_per_day'] = period_metrics['total_cases'] / period_metrics['duration_days']
# Visualize legislative cycles
plt.figure(figsize=(18, 12))
# Cases per period
plt.subplot(2, 3, 1)
period_metrics['total_cases'].plot(kind='bar')
plt.title('Total Cases by Parliamentary Period')
plt.xlabel('Period ID')
plt.ylabel('Number of Cases')
plt.xticks(rotation=45)
# Decision efficiency by period
plt.subplot(2, 3, 2)
period_metrics['decision_rate'].plot(kind='bar', color='green')
plt.title('Decision Rate by Period')
plt.xlabel('Period ID')
plt.ylabel('Proportion of Cases Decided')
plt.xticks(rotation=45)
# Activity intensity (cases per day)
plt.subplot(2, 3, 3)
period_metrics['cases_per_day'].plot(kind='bar', color='orange')
plt.title('Legislative Intensity by Period')
plt.xlabel('Period ID')
plt.ylabel('Cases per Day')
plt.xticks(rotation=45)
# Timeline of legislative activity
plt.subplot(2, 3, 4)
timeline_data = period_metrics.reset_index()
plt.scatter(timeline_data['startdato'], timeline_data['total_cases'],
s=timeline_data['duration_days']/5, alpha=0.6, c=timeline_data['decision_rate'],
cmap='viridis')
plt.colorbar(label='Decision Rate')
plt.title('Legislative Activity Timeline')
plt.xlabel('Period Start Date')
plt.ylabel('Total Cases')
plt.xticks(rotation=45)
# Period duration vs activity
plt.subplot(2, 3, 5)
plt.scatter(period_metrics['duration_days'], period_metrics['total_cases'], alpha=0.7)
plt.xlabel('Period Duration (days)')
plt.ylabel('Total Cases')
plt.title('Period Duration vs Legislative Activity')
# Case category distribution over time
plt.subplot(2, 3, 6)
category_timeline = legislative_activity.groupby(['startdato', 'kategoriid']).size().unstack(fill_value=0)
category_timeline.plot(kind='area', stacked=True, alpha=0.7)
plt.title('Case Categories Over Time')
plt.xlabel('Period Start')
plt.ylabel('Number of Cases')
plt.legend(title='Category ID', bbox_to_anchor=(1.05, 1), loc='upper left')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
return period_metrics, legislative_activity
# Analyze legislative cycles
cycle_metrics, cycle_activity = analyze_legislative_cycles()
Historical Trend Analysis Across Decades¶
1. Long-term Political Evolution¶
Analyze how Danish politics has evolved over decades using the extensive historical data:
def analyze_historical_political_evolution():
"""Analyze long-term trends in Danish parliamentary politics"""
# Get historical period data
periods_url = "https://oda.ft.dk/api/Periode"
periods_params = {
'$orderby': 'startdato asc',
'$top': 200
}
periods_response = requests.get(periods_url, params=periods_params)
periods_data = periods_response.json()
# Get actor data across all periods
actors_url = "https://oda.ft.dk/api/Aktør"
actors_params = {
'$filter': "startdato ne null",
'$select': 'id,navn,gruppenavnkort,startdato,slutdato,typeid,periodeid',
'$top': 2000
}
actors_response = requests.get(actors_url, params=actors_params)
actors_data = actors_response.json()
# Process historical data
periods_df = pd.DataFrame(periods_data['value'])
periods_df['startdato'] = pd.to_datetime(periods_df['startdato'])
periods_df['slutdato'] = pd.to_datetime(periods_df['slutdato'])
periods_df['decade'] = (periods_df['startdato'].dt.year // 10) * 10
actors_df = pd.DataFrame(actors_data['value'])
actors_df['startdato'] = pd.to_datetime(actors_df['startdato'])
actors_df['slutdato'] = pd.to_datetime(actors_df['slutdato'], errors='coerce')
actors_df['start_decade'] = (actors_df['startdato'].dt.year // 10) * 10
# Historical trend analysis
plt.figure(figsize=(20, 15))
# Periods per decade
plt.subplot(3, 3, 1)
decade_periods = periods_df.groupby('decade').size()
decade_periods.plot(kind='bar')
plt.title('Parliamentary Periods by Decade')
plt.xlabel('Decade')
plt.ylabel('Number of Periods')
plt.xticks(rotation=45)
# Political party evolution
plt.subplot(3, 3, 2)
party_decade = actors_df.groupby(['start_decade', 'gruppenavnkort']).size().unstack(fill_value=0)
# Show top parties only
top_parties = actors_df['gruppenavnkort'].value_counts().head(8).index
party_decade_filtered = party_decade[top_parties]
party_decade_filtered.plot(kind='bar', stacked=True)
plt.title('Party Representation by Decade')
plt.xlabel('Decade')
plt.ylabel('Number of Politicians')
plt.legend(title='Party', bbox_to_anchor=(1.05, 1), loc='upper left')
plt.xticks(rotation=45)
# Career length evolution
plt.subplot(3, 3, 3)
actors_with_end = actors_df.dropna(subset=['slutdato'])
actors_with_end['career_years'] = (actors_with_end['slutdato'] - actors_with_end['startdato']).dt.days / 365.25
career_by_decade = actors_with_end.groupby('start_decade')['career_years'].mean()
career_by_decade.plot(kind='line', marker='o')
plt.title('Average Career Length by Start Decade')
plt.xlabel('Start Decade')
plt.ylabel('Average Career Length (years)')
# Period duration trends
plt.subplot(3, 3, 4)
periods_df['duration_years'] = (periods_df['slutdato'] - periods_df['startdato']).dt.days / 365.25
period_duration_by_decade = periods_df.groupby('decade')['duration_years'].mean()
period_duration_by_decade.plot(kind='line', marker='s', color='red')
plt.title('Average Period Duration by Decade')
plt.xlabel('Decade')
plt.ylabel('Average Duration (years)')
# Political diversity index (number of unique parties)
plt.subplot(3, 3, 5)
diversity_by_decade = actors_df.groupby('start_decade')['gruppenavnkort'].nunique()
diversity_by_decade.plot(kind='bar', color='purple')
plt.title('Political Diversity by Decade')
plt.xlabel('Decade')
plt.ylabel('Number of Unique Parties')
plt.xticks(rotation=45)
# New politician entries over time
plt.subplot(3, 3, 6)
new_politicians = actors_df.groupby(actors_df['startdato'].dt.year).size()
new_politicians.plot(kind='line')
plt.title('New Politicians Entering Parliament by Year')
plt.xlabel('Year')
plt.ylabel('New Politicians')
# Party stability analysis
plt.subplot(3, 3, 7)
# Calculate party presence across decades
party_presence = actors_df.groupby('gruppenavnkort')['start_decade'].nunique().sort_values(ascending=False)
party_presence.head(15).plot(kind='barh')
plt.title('Party Longevity (Decades of Presence)')
plt.xlabel('Number of Decades Present')
# Historical timeline visualization
plt.subplot(3, 3, 8)
periods_timeline = periods_df.set_index('startdato')['titel'].str[:20] # Truncate titles
# Show sample of periods
sample_periods = periods_timeline.iloc[::10] # Every 10th period
plt.scatter(sample_periods.index, range(len(sample_periods)), alpha=0.7)
plt.title('Historical Parliamentary Timeline (Sample)')
plt.xlabel('Year')
plt.ylabel('Period Index')
plt.xticks(rotation=45)
# Summary statistics box
plt.subplot(3, 3, 9)
stats_text = f"""
Historical Coverage:
" Total Periods: {len(periods_df)}
" Date Range: {periods_df['startdato'].min().year}-{periods_df['startdato'].max().year}
" Total Politicians: {len(actors_df)}
" Unique Parties: {actors_df['gruppenavnkort'].nunique()}
" Average Period: {periods_df['duration_years'].mean():.1f} years
" Average Career: {actors_with_end['career_years'].mean():.1f} years
"""
plt.text(0.1, 0.5, stats_text, fontsize=10, verticalalignment='center',
bbox=dict(boxstyle='round', facecolor='lightblue', alpha=0.8))
plt.xlim(0, 1)
plt.ylim(0, 1)
plt.axis('off')
plt.title('Historical Summary')
plt.tight_layout()
plt.show()
return periods_df, actors_df
# Analyze historical evolution
historical_periods, historical_actors = analyze_historical_political_evolution()
2. Comparative Decade Analysis¶
Compare parliamentary characteristics across different decades:
def comparative_decade_analysis():
"""Compare parliamentary characteristics across decades"""
# Combine multiple data sources for comprehensive decade comparison
datasets = {}
# Get case data
cases_url = "https://oda.ft.dk/api/Sag"
cases_params = {
'$expand': 'Periode',
'$select': 'id,titel,periodeid,afgørelsesdato,kategoriid',
'$top': 2000
}
cases_response = requests.get(cases_url, params=cases_params)
cases_data = cases_response.json()
if cases_data.get('value'):
# Derive each case's decade from its expanded Periode start date
for case in cases_data['value']:
if 'Periode' in case and case['Periode']:
period = case['Periode']
if period.get('startdato'):
start_year = datetime.fromisoformat(period['startdato'].replace('Z', '+00:00')).year
case['decade'] = (start_year // 10) * 10
cases_df = pd.DataFrame(cases_data['value'])
cases_df = cases_df[cases_df['decade'].notna()]
# Comparative analysis by decade
decade_comparison = {}
for decade in sorted(cases_df['decade'].unique()):
decade_cases = cases_df[cases_df['decade'] == decade]
decade_comparison[decade] = {
'total_cases': len(decade_cases),
'decided_cases': len(decade_cases[decade_cases['afgørelsesdato'].notna()]),
'categories': decade_cases['kategoriid'].nunique(),
'decision_rate': len(decade_cases[decade_cases['afgørelsesdato'].notna()]) / len(decade_cases)
}
comparison_df = pd.DataFrame.from_dict(decade_comparison, orient='index')
# Visualization
plt.figure(figsize=(15, 10))
# Cases by decade
plt.subplot(2, 2, 1)
comparison_df['total_cases'].plot(kind='bar')
plt.title('Total Cases by Decade')
plt.xlabel('Decade')
plt.ylabel('Number of Cases')
plt.xticks(rotation=45)
# Decision rate trends
plt.subplot(2, 2, 2)
comparison_df['decision_rate'].plot(kind='line', marker='o', color='green')
plt.title('Decision Rate Trends')
plt.xlabel('Decade')
plt.ylabel('Decision Rate')
plt.ylim(0, 1)
# Legislative complexity (categories)
plt.subplot(2, 2, 3)
comparison_df['categories'].plot(kind='bar', color='orange')
plt.title('Legislative Complexity by Decade')
plt.xlabel('Decade')
plt.ylabel('Number of Case Categories')
plt.xticks(rotation=45)
# Comparative heatmap
plt.subplot(2, 2, 4)
# Normalize for comparison
normalized_comparison = (comparison_df - comparison_df.min()) / (comparison_df.max() - comparison_df.min())
sns.heatmap(normalized_comparison.T, annot=True, fmt='.2f', cmap='RdYlBu_r')
plt.title('Normalized Decade Comparison')
plt.xlabel('Decade')
plt.tight_layout()
plt.show()
return comparison_df
return None
# Perform comparative analysis
decade_comparison = comparative_decade_analysis()
Event Detection and Anomaly Identification¶
1. Parliamentary Activity Anomaly Detection¶
Identify unusual patterns or events in parliamentary activity that may indicate significant political events:
import numpy as np
from scipy import stats
from sklearn.preprocessing import StandardScaler
def detect_parliamentary_anomalies():
"""Detect anomalous patterns in parliamentary activity"""
# Get comprehensive activity data
url = "https://oda.ft.dk/api/Møde"
params = {
'$filter': "dato ge datetime'2015-01-01'",
'$select': 'id,dato,titel,statusid',
'$orderby': 'dato asc',
'$top': 5000
}
response = requests.get(url, params=params)
data = response.json()
# Process meeting data for anomaly detection
meetings = []
for meeting in data['value']:
if meeting.get('dato'):
meeting_date = datetime.fromisoformat(meeting['dato'].replace('Z', '+00:00'))
meetings.append({
'date': meeting_date,
'year': meeting_date.year,
'month': meeting_date.month,
'week': meeting_date.isocalendar()[1],
'weekday': meeting_date.weekday(),
'title': meeting.get('titel', '')
})
df = pd.DataFrame(meetings)
# Calculate activity metrics for anomaly detection
weekly_activity = df.groupby(['year', 'week']).size().reset_index(name='meeting_count')
weekly_activity['date_week'] = pd.to_datetime(weekly_activity['year'].astype(str) + '-01-01') + pd.to_timedelta((weekly_activity['week'] - 1) * 7, unit='d')
# Detect statistical anomalies
scaler = StandardScaler()
activity_scaled = scaler.fit_transform(weekly_activity[['meeting_count']])
# Z-score based anomaly detection
z_scores = np.abs(stats.zscore(weekly_activity['meeting_count']))
anomaly_threshold = 2.5 # Standard deviations
anomalies = weekly_activity[z_scores > anomaly_threshold]
# Seasonal decomposition for trend analysis
weekly_activity_ts = weekly_activity.set_index('date_week')['meeting_count']
# Rolling statistics for change detection
weekly_activity['rolling_mean'] = weekly_activity['meeting_count'].rolling(window=8, center=True).mean()
weekly_activity['rolling_std'] = weekly_activity['meeting_count'].rolling(window=8, center=True).std()
# Identify sudden changes
weekly_activity['change_indicator'] = abs(weekly_activity['meeting_count'] - weekly_activity['rolling_mean']) / weekly_activity['rolling_std']
significant_changes = weekly_activity[weekly_activity['change_indicator'] > 2.0]
# Visualization
plt.figure(figsize=(18, 12))
# Time series with anomalies
plt.subplot(3, 2, 1)
plt.plot(weekly_activity['date_week'], weekly_activity['meeting_count'], alpha=0.7, label='Weekly Meetings')
plt.plot(weekly_activity['date_week'], weekly_activity['rolling_mean'], color='red', label='8-week Moving Average')
plt.scatter(anomalies['date_week'], anomalies['meeting_count'], color='red', s=50, label='Anomalies', zorder=5)
plt.title('Parliamentary Activity with Anomaly Detection')
plt.xlabel('Date')
plt.ylabel('Weekly Meeting Count')
plt.legend()
plt.xticks(rotation=45)
# Z-score distribution
plt.subplot(3, 2, 2)
plt.hist(z_scores, bins=30, alpha=0.7, edgecolor='black')
plt.axvline(x=anomaly_threshold, color='red', linestyle='--', label=f'Threshold ({anomaly_threshold}σ)')
plt.title('Z-Score Distribution of Weekly Activity')
plt.xlabel('Z-Score')
plt.ylabel('Frequency')
plt.legend()
# Seasonal pattern analysis
plt.subplot(3, 2, 3)
monthly_pattern = df.groupby('month')['date'].count()
monthly_pattern.plot(kind='bar')
plt.title('Seasonal Activity Pattern')
plt.xlabel('Month')
plt.ylabel('Total Meetings')
plt.xticks(rotation=45)
# Weekly pattern analysis
plt.subplot(3, 2, 4)
weekday_pattern = df.groupby('weekday')['date'].count()
weekday_pattern.plot(kind='bar', color='orange')
plt.title('Weekly Activity Pattern')
plt.xlabel('Day of Week (0=Monday)')
plt.ylabel('Total Meetings')
plt.xticks(rotation=0)
# Change detection visualization
plt.subplot(3, 2, 5)
plt.plot(weekly_activity['date_week'], weekly_activity['change_indicator'], alpha=0.7)
plt.axhline(y=2.0, color='red', linestyle='--', label='Change Threshold')
plt.scatter(significant_changes['date_week'], significant_changes['change_indicator'],
color='red', s=30, label='Significant Changes')
plt.title('Activity Change Detection')
plt.xlabel('Date')
plt.ylabel('Change Magnitude (σ)')
plt.legend()
plt.xticks(rotation=45)
# Anomaly summary
plt.subplot(3, 2, 6)
anomaly_summary = f"""
Anomaly Detection Results:
" Total Weeks Analyzed: {len(weekly_activity)}
" Statistical Anomalies: {len(anomalies)}
" Significant Changes: {len(significant_changes)}
" Average Weekly Meetings: {weekly_activity['meeting_count'].mean():.1f}
" Activity Standard Deviation: {weekly_activity['meeting_count'].std():.1f}
Anomaly Periods:
{anomalies[['year', 'week', 'meeting_count']].to_string(index=False) if not anomalies.empty else 'None detected'}
"""
plt.text(0.05, 0.95, anomaly_summary, fontsize=9, verticalalignment='top',
bbox=dict(boxstyle='round', facecolor='lightyellow', alpha=0.8),
transform=plt.gca().transAxes)
plt.xlim(0, 1)
plt.ylim(0, 1)
plt.axis('off')
plt.title('Anomaly Summary')
plt.tight_layout()
plt.show()
return weekly_activity, anomalies, significant_changes
# Detect anomalies
activity_data, anomalies, changes = detect_parliamentary_anomalies()
2. Voting Pattern Change Detection¶
Identify significant shifts in voting patterns that may indicate political realignments:
def detect_voting_pattern_changes():
"""Detect significant changes in voting patterns over time"""
# Get voting data with party information
url = "https://oda.ft.dk/api/Afstemning"
params = {
'$expand': 'Stemme/Aktør',
'$filter': "year(opdateringsdato) ge 2018",
'$orderby': 'opdateringsdato asc',
'$top': 1000
}
response = requests.get(url, params=params)
data = response.json()
# Process voting data
voting_data = []
for vote_session in data['value']:
session_date = datetime.fromisoformat(vote_session['opdateringsdato'].replace('Z', '+00:00'))
session_result = vote_session.get('vedtaget', None)
if 'Stemme' in vote_session and vote_session['Stemme']:
party_votes = {}
for vote in vote_session['Stemme']:
if 'Aktør' in vote and vote['Aktør']:
party = vote['Aktør'].get('gruppenavnkort', 'Unknown')
if party not in party_votes:
party_votes[party] = {'for': 0, 'against': 0, 'abstain': 0}
# Map vote types (assuming 1=for, 2=against, 3=abstain)
vote_type = vote.get('typeid', 0)
if vote_type == 1:
party_votes[party]['for'] += 1
elif vote_type == 2:
party_votes[party]['against'] += 1
else:
party_votes[party]['abstain'] += 1
# Calculate party agreement with session result
for party, votes in party_votes.items():
total_votes = sum(votes.values())
if total_votes > 0:
agreement_rate = votes['for'] / total_votes if session_result else votes['against'] / total_votes
voting_data.append({
'session_id': vote_session['id'],
'date': session_date,
'party': party,
'agreement_rate': agreement_rate,
'total_votes': total_votes,
'session_result': session_result
})
df = pd.DataFrame(voting_data)
if not df.empty:
# Calculate rolling agreement rates
df = df.sort_values('date')
df['month'] = df['date'].dt.to_period('M')
# Monthly party agreement rates
monthly_agreement = df.groupby(['month', 'party'])['agreement_rate'].mean().unstack(fill_value=0)
# Detect significant changes in party positions
change_detection = {}
for party in monthly_agreement.columns:
party_series = monthly_agreement[party]
# Calculate rolling correlation with itself (lagged)
if len(party_series) > 12: # Need enough data points
rolling_corr = party_series.rolling(window=6).corr(party_series.shift(6))
significant_changes = rolling_corr[rolling_corr < 0.5] # Low correlation indicates change
if not significant_changes.empty:
change_detection[party] = significant_changes
# Visualization
plt.figure(figsize=(18, 12))
# Party agreement trends
plt.subplot(2, 3, 1)
for party in monthly_agreement.columns[:8]: # first 8 party columns
if monthly_agreement[party].sum() > 0: # Only parties with data
monthly_agreement[party].plot(label=party, alpha=0.8)
plt.title('Party Agreement Rate Trends')
plt.xlabel('Month')
plt.ylabel('Agreement Rate')
plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
plt.xticks(rotation=45)
# Agreement rate distribution
plt.subplot(2, 3, 2)
plt.hist(df['agreement_rate'], bins=30, alpha=0.7, edgecolor='black')
plt.title('Distribution of Party Agreement Rates')
plt.xlabel('Agreement Rate')
plt.ylabel('Frequency')
# Party cohesion analysis
plt.subplot(2, 3, 3)
party_cohesion = df.groupby('party')['agreement_rate'].agg(['mean', 'std']).sort_values('mean', ascending=False)
party_cohesion = party_cohesion.head(10) # Top 10 parties
plt.errorbar(range(len(party_cohesion)), party_cohesion['mean'],
yerr=party_cohesion['std'], fmt='o', capsize=5)
plt.xticks(range(len(party_cohesion)), party_cohesion.index, rotation=45)
plt.title('Party Voting Cohesion')
plt.ylabel('Mean Agreement Rate ± Std Dev')
# Temporal agreement heatmap
plt.subplot(2, 3, 4)
# Sample a subset for readability
heatmap_data = monthly_agreement.iloc[-24:, :8] if len(monthly_agreement) > 24 else monthly_agreement.iloc[:, :8]
sns.heatmap(heatmap_data.T, cmap='RdBu_r', center=0.5, annot=False)
plt.title('Party Agreement Heatmap (Recent 24 months)')
plt.xlabel('Month')
plt.ylabel('Party')
# Change detection results
plt.subplot(2, 3, 5)
if change_detection:
change_summary = f"Significant Pattern Changes Detected:\n\n"
for party, changes in list(change_detection.items())[:5]: # Top 5
change_summary += f"" {party}: {len(changes)} periods\n"
change_summary += f"\nTotal parties with changes: {len(change_detection)}"
else:
change_summary = "No significant pattern changes detected in the analyzed period."
plt.text(0.05, 0.95, change_summary, fontsize=10, verticalalignment='top',
bbox=dict(boxstyle='round', facecolor='lightcoral', alpha=0.8),
transform=plt.gca().transAxes)
plt.xlim(0, 1)
plt.ylim(0, 1)
plt.axis('off')
plt.title('Pattern Change Detection')
# Voting participation trends
plt.subplot(2, 3, 6)
monthly_participation = df.groupby('month')['total_votes'].sum()
monthly_participation.plot(kind='line', marker='o', color='green')
plt.title('Monthly Voting Participation')
plt.xlabel('Month')
plt.ylabel('Total Votes Cast')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
return df, monthly_agreement, change_detection
return None, None, None
# Detect voting pattern changes
voting_df, agreement_trends, pattern_changes = detect_voting_pattern_changes()
Time Series Analysis and Forecasting¶
1. Parliamentary Activity Forecasting¶
Use time series analysis to predict future parliamentary activity patterns:
from sklearn.metrics import mean_absolute_error, mean_squared_error
from scipy import signal
def forecast_parliamentary_activity():
"""Forecast future parliamentary activity using time series analysis"""
# Get historical meeting data
url = "https://oda.ft.dk/api/Møde"
params = {
'$filter': "dato ge datetime'2016-01-01'",
'$select': 'id,dato,titel',
'$orderby': 'dato asc',
'$top': 5000
}
response = requests.get(url, params=params)
data = response.json()
# Process data for time series analysis
meetings = []
for meeting in data['value']:
if meeting.get('dato'):
meeting_date = datetime.fromisoformat(meeting['dato'].replace('Z', '+00:00'))
meetings.append(meeting_date)
# Create time series
ts_df = pd.DataFrame(meetings, columns=['date'])
ts_df['date'] = pd.to_datetime(ts_df['date'])
# Aggregate to weekly counts
weekly_counts = ts_df.set_index('date').resample('W').size()
weekly_counts = weekly_counts.fillna(0)
# Time series analysis
plt.figure(figsize=(18, 15))
# Original time series
plt.subplot(3, 3, 1)
weekly_counts.plot()
plt.title('Weekly Parliamentary Meeting Counts')
plt.ylabel('Number of Meetings')
plt.xlabel('Date')
# Seasonal decomposition (approximate)
# Calculate trend using moving average
trend_window = 52 # 1 year
trend = weekly_counts.rolling(window=trend_window, center=True).mean()
detrended = weekly_counts - trend
plt.subplot(3, 3, 2)
trend.plot(label='Trend', color='red')
weekly_counts.plot(alpha=0.5, label='Original')
plt.title('Trend Analysis')
plt.ylabel('Meetings')
plt.legend()
# Seasonal pattern
plt.subplot(3, 3, 3)
seasonal_pattern = detrended.groupby(detrended.index.isocalendar().week).mean()
seasonal_pattern.plot(kind='bar')
plt.title('Seasonal Pattern (by Week of Year)')
plt.xlabel('Week of Year')
plt.ylabel('Deviation from Trend')
plt.xticks(rotation=45)
# Autocorrelation analysis
plt.subplot(3, 3, 4)
autocorr = [weekly_counts.autocorr(lag=i) for i in range(1, 53)]
plt.plot(range(1, 53), autocorr)
plt.title('Autocorrelation Function')
plt.xlabel('Lag (weeks)')
plt.ylabel('Autocorrelation')
plt.grid(True)
# Spectral analysis (frequency domain)
plt.subplot(3, 3, 5)
freqs, psd = signal.periodogram(weekly_counts.fillna(weekly_counts.mean()))
plt.semilogy(freqs, psd)
plt.title('Power Spectral Density')
plt.xlabel('Frequency')
plt.ylabel('Power')
# Simple forecasting using seasonal naive method
# Use last year's pattern for forecasting
last_year_data = weekly_counts.tail(52)
forecast_periods = 26 # 6 months ahead
# Seasonal naive forecast (repeat last year's pattern)
forecast = []
for i in range(forecast_periods):
seasonal_index = i % 52
if seasonal_index < len(last_year_data):
forecast.append(last_year_data.iloc[seasonal_index])
else:
forecast.append(last_year_data.mean())
# Create forecast dates
last_date = weekly_counts.index[-1]
forecast_dates = pd.date_range(start=last_date + pd.Timedelta(weeks=1),
periods=forecast_periods, freq='W')
forecast_series = pd.Series(forecast, index=forecast_dates)
plt.subplot(3, 3, 6)
# Plot last 2 years + forecast
recent_data = weekly_counts.tail(104)
recent_data.plot(label='Historical', color='blue')
forecast_series.plot(label='Forecast', color='red', linestyle='--')
plt.title('Parliamentary Activity Forecast (6 months)')
plt.ylabel('Meetings per Week')
plt.legend()
# Forecast accuracy assessment (using historical data)
# Split data for validation
split_point = len(weekly_counts) - 26
train_data = weekly_counts[:split_point]
test_data = weekly_counts[split_point:]
# Generate historical forecast for validation
seasonal_forecast = []
for i in range(len(test_data)):
seasonal_index = (split_point + i) % 52
if seasonal_index < len(train_data.tail(52)):
seasonal_forecast.append(train_data.tail(52).iloc[seasonal_index])
else:
seasonal_forecast.append(train_data.tail(52).mean())
# Calculate forecast errors
mae = mean_absolute_error(test_data, seasonal_forecast)
mse = mean_squared_error(test_data, seasonal_forecast)
rmse = np.sqrt(mse)
plt.subplot(3, 3, 7)
test_data.plot(label='Actual', color='green')
plt.plot(test_data.index, seasonal_forecast, label='Forecast', color='red', linestyle='--')
plt.title('Forecast Validation')
plt.ylabel('Meetings per Week')
plt.legend()
# Forecast statistics
plt.subplot(3, 3, 8)
stats_text = f"""
Forecast Performance Metrics:
" Mean Absolute Error: {mae:.2f}
" Root Mean Square Error: {rmse:.2f}
" Mean Weekly Meetings: {weekly_counts.mean():.2f}
" Relative Error: {(mae/weekly_counts.mean())*100:.1f}%
Forecast Summary (Next 6 months):
" Predicted Average: {forecast_series.mean():.2f} meetings/week
" Predicted Range: {forecast_series.min():.1f} - {forecast_series.max():.1f}
" Peak Activity Week: Week {forecast_series.idxmax().week}
" Low Activity Week: Week {forecast_series.idxmin().week}
"""
plt.text(0.05, 0.95, stats_text, fontsize=9, verticalalignment='top',
bbox=dict(boxstyle='round', facecolor='lightgreen', alpha=0.8),
transform=plt.gca().transAxes)
plt.xlim(0, 1)
plt.ylim(0, 1)
plt.axis('off')
plt.title('Forecast Statistics')
# Residual analysis
plt.subplot(3, 3, 9)
residuals = test_data - seasonal_forecast
plt.hist(residuals, bins=15, alpha=0.7, edgecolor='black')
plt.title('Forecast Residuals Distribution')
plt.xlabel('Residual (Actual - Forecast)')
plt.ylabel('Frequency')
plt.tight_layout()
plt.show()
return weekly_counts, forecast_series, {'mae': mae, 'rmse': rmse}
# Generate forecast
activity_ts, forecast, metrics = forecast_parliamentary_activity()
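If statsmodels is available, its seasonal_decompose offers a more principled alternative to the rolling-mean decomposition approximated inside the function above. A minimal sketch on the returned weekly series (statsmodels is an extra dependency not used elsewhere on this page):
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose

# Additive decomposition with a 52-week seasonal period; activity_ts is the
# weekly meeting-count series returned by forecast_parliamentary_activity()
decomposition = seasonal_decompose(activity_ts, model='additive', period=52)
decomposition.plot()
plt.show()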
2. Legislative Throughput Prediction¶
Predict legislative throughput based on historical patterns:
def predict_legislative_throughput():
"""Predict legislative throughput using historical case data"""
# Get case data with decision dates
url = "https://oda.ft.dk/api/Sag"
params = {
'$filter': "afgørelsesdato ge datetime'2017-01-01'",
'$select': 'id,titel,afgørelsesdato,opdateringsdato,periodeid,kategoriid',
'$orderby': 'afgørelsesdato asc',
'$top': 2000
}
response = requests.get(url, params=params)
data = response.json()
# Process legislative throughput data
decisions = []
for case in data['value']:
if case.get('afgørelsesdato'):
decision_date = datetime.fromisoformat(case['afgørelsesdato'].replace('Z', '+00:00'))
decisions.append({
'date': decision_date,
'case_id': case['id'],
'category': case.get('kategoriid'),
'period': case.get('periodeid')
})
df = pd.DataFrame(decisions)
if not df.empty:
# Create monthly throughput time series
df['month'] = df['date'].dt.to_period('M')
monthly_throughput = df.groupby('month').size()
# Throughput by category
category_throughput = df.groupby(['month', 'category']).size().unstack(fill_value=0)
# Time series modeling and prediction
plt.figure(figsize=(16, 12))
# Historical throughput
plt.subplot(3, 2, 1)
monthly_throughput.plot()
plt.title('Monthly Legislative Throughput')
plt.ylabel('Cases Decided')
plt.xlabel('Month')
# Throughput by category
plt.subplot(3, 2, 2)
# Show top 5 categories
top_categories = df['category'].value_counts().head(5).index
for category in top_categories:
if category in category_throughput.columns:
category_throughput[category].plot(label=f'Category {category}', alpha=0.8)
plt.title('Throughput by Case Category')
plt.ylabel('Cases Decided')
plt.legend()
# Seasonal analysis
plt.subplot(3, 2, 3)
df['month_num'] = df['date'].dt.month
seasonal_throughput = df.groupby('month_num').size()
seasonal_throughput.plot(kind='bar')
plt.title('Seasonal Throughput Pattern')
plt.xlabel('Month of Year')
plt.ylabel('Total Cases Decided')
plt.xticks(rotation=45)
# Throughput prediction using linear trend
monthly_throughput_numeric = monthly_throughput.reset_index(name='count')
monthly_throughput_numeric['month_numeric'] = range(len(monthly_throughput_numeric))
# Fit linear trend
from sklearn.linear_model import LinearRegression
X = monthly_throughput_numeric[['month_numeric']]
y = monthly_throughput_numeric['count'] # monthly decision counts
model = LinearRegression()
model.fit(X, y)
# Predict next 12 months
future_months = range(len(monthly_throughput_numeric), len(monthly_throughput_numeric) + 12)
future_predictions = model.predict([[month] for month in future_months])
# Create future dates
last_period = monthly_throughput.index[-1]
future_periods = [last_period + i for i in range(1, 13)]
plt.subplot(3, 2, 4)
monthly_throughput.plot(label='Historical')
plt.plot(future_periods, future_predictions, 'r--', label='Predicted', marker='o')
plt.title('Legislative Throughput Prediction')
plt.ylabel('Cases Decided per Month')
plt.legend()
# Performance analysis
plt.subplot(3, 2, 5)
# Calculate throughput efficiency (decisions per day)
df['weekday'] = df['date'].dt.weekday
weekday_throughput = df.groupby('weekday').size().reindex(range(7), fill_value=0)
weekday_names = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']
plt.bar(weekday_names, weekday_throughput.values)
plt.title('Throughput by Day of Week')
plt.ylabel('Total Cases Decided')
plt.xticks(rotation=45)
# Prediction summary
plt.subplot(3, 2, 6)
current_avg = monthly_throughput.mean()
predicted_avg = np.mean(future_predictions)
trend_slope = model.coef_[0]
summary_text = f"""
Legislative Throughput Analysis:
Historical Performance:
" Average Monthly Decisions: {current_avg:.1f}
" Total Cases Analyzed: {len(df)}
" Active Period: {df['date'].min().strftime('%Y-%m')} to {df['date'].max().strftime('%Y-%m')}
Trend Analysis:
" Monthly Trend: {trend_slope:+.2f} cases/month
" Trend Direction: {'Increasing' if trend_slope > 0 else 'Decreasing'}
12-Month Forecast:
" Predicted Avg: {predicted_avg:.1f} cases/month
" Expected Range: {min(future_predictions):.0f} - {max(future_predictions):.0f}
" Total Predicted: {sum(future_predictions):.0f} cases
"""
plt.text(0.05, 0.95, summary_text, fontsize=9, verticalalignment='top',
bbox=dict(boxstyle='round', facecolor='lightblue', alpha=0.8),
transform=plt.gca().transAxes)
plt.xlim(0, 1)
plt.ylim(0, 1)
plt.axis('off')
plt.title('Throughput Forecast Summary')
plt.tight_layout()
plt.show()
return monthly_throughput, future_predictions, {
'historical_avg': current_avg,
'predicted_avg': predicted_avg,
'trend': trend_slope
}
return None, None, None
# Predict throughput
throughput_ts, predictions, throughput_metrics = predict_legislative_throughput()
Temporal Correlation Analysis¶
1. Cross-Entity Temporal Correlations¶
Analyze how activities in different parliamentary entities correlate over time:
def analyze_temporal_correlations():
"""Analyze temporal correlations between different parliamentary activities"""
# Collect data from multiple entities
entities_data = {}
# Meeting activity
meetings_url = "https://oda.ft.dk/api/Møde"
meetings_params = {
'$filter': "dato ge datetime'2018-01-01'",
'$select': 'id,dato',
'$top': 2000
}
meetings_response = requests.get(meetings_url, params=meetings_params)
if meetings_response.status_code == 200:
meetings_data = meetings_response.json()
meetings_dates = [datetime.fromisoformat(m['dato'].replace('Z', '+00:00'))
for m in meetings_data['value'] if m.get('dato')]
entities_data['meetings'] = meetings_dates
# Case decisions
cases_url = "https://oda.ft.dk/api/Sag"
cases_params = {
'$filter': "afgørelsesdato ge datetime'2018-01-01'",
'$select': 'id,afgørelsesdato',
'$top': 2000
}
cases_response = requests.get(cases_url, params=cases_params)
if cases_response.status_code == 200:
cases_data = cases_response.json()
cases_dates = [datetime.fromisoformat(c['afgørelsesdato'].replace('Z', '+00:00'))
for c in cases_data['value'] if c.get('afgørelsesdato')]
entities_data['decisions'] = cases_dates
# Document publication
docs_url = "https://oda.ft.dk/api/Dokument"
docs_params = {
'$filter': "dato ge datetime'2018-01-01'",
'$select': 'id,dato',
'$top': 2000
}
docs_response = requests.get(docs_url, params=docs_params)
if docs_response.status_code == 200:
docs_data = docs_response.json()
docs_dates = [datetime.fromisoformat(d['dato'].replace('Z', '+00:00'))
for d in docs_data['value'] if d.get('dato')]
entities_data['documents'] = docs_dates
# Convert to weekly time series for correlation analysis
weekly_series = {}
for entity, dates in entities_data.items():
if dates:
# Create DataFrame and aggregate to weekly
df = pd.DataFrame(dates, columns=['date'])
df['date'] = pd.to_datetime(df['date'])
weekly_counts = df.set_index('date').resample('W').size()
# Ensure all series have the same date range
if not weekly_series:
date_range = weekly_counts.index
else:
# Find common date range
date_range = weekly_counts.index.intersection(date_range)
weekly_series[entity] = weekly_counts
# Align all series to common date range
aligned_series = {}
for entity, series in weekly_series.items():
aligned_series[entity] = series.reindex(date_range).fillna(0)
# Create correlation matrix
if len(aligned_series) > 1:
correlation_df = pd.DataFrame(aligned_series)
correlation_matrix = correlation_df.corr()
# Lag correlation analysis
max_lag = 8 # 8 weeks
lag_correlations = {}
entity_names = list(aligned_series.keys())
for i, entity1 in enumerate(entity_names):
for j, entity2 in enumerate(entity_names):
if i < j: # Avoid duplicate pairs
lag_corr = []
for lag in range(-max_lag, max_lag + 1):
if lag == 0:
corr = aligned_series[entity1].corr(aligned_series[entity2])
elif lag > 0:
# Positive lag: tests whether entity2 activity leads entity1 by `lag` weeks
corr = aligned_series[entity1].shift(-lag).corr(aligned_series[entity2])
else:
# Negative lag: tests whether entity1 activity leads entity2
corr = aligned_series[entity1].corr(aligned_series[entity2].shift(lag))
lag_corr.append(corr)
lag_correlations[f"{entity1}_vs_{entity2}"] = lag_corr
# Visualization
plt.figure(figsize=(18, 12))
# Time series plot
plt.subplot(2, 3, 1)
for entity, series in aligned_series.items():
series.plot(label=entity.capitalize(), alpha=0.8)
plt.title('Parliamentary Activity Time Series')
plt.ylabel('Weekly Activity Count')
plt.legend()
plt.xticks(rotation=45)
# Correlation heatmap
plt.subplot(2, 3, 2)
sns.heatmap(correlation_matrix, annot=True, cmap='RdBu_r', center=0,
square=True, linewidths=0.5)
plt.title('Cross-Activity Correlation Matrix')
# Lag correlation analysis
plt.subplot(2, 3, 3)
for pair, lag_corr in lag_correlations.items():
plt.plot(range(-max_lag, max_lag + 1), lag_corr,
label=pair.replace('_vs_', ' vs '), marker='o')
plt.title('Lag Correlation Analysis')
plt.xlabel('Lag (weeks)')
plt.ylabel('Correlation')
plt.axhline(y=0, color='black', linestyle='--', alpha=0.5)
plt.axvline(x=0, color='red', linestyle='--', alpha=0.5)
plt.legend()
plt.grid(True, alpha=0.3)
# Activity distribution comparison
plt.subplot(2, 3, 4)
for entity, series in aligned_series.items():
plt.hist(series, alpha=0.6, label=entity.capitalize(), bins=20)
plt.title('Activity Distribution Comparison')
plt.xlabel('Weekly Activity Count')
plt.ylabel('Frequency')
plt.legend()
# Cross-correlation scatter plots
if len(aligned_series) >= 2:
entities = list(aligned_series.keys())
plt.subplot(2, 3, 5)
plt.scatter(aligned_series[entities[0]], aligned_series[entities[1]], alpha=0.6)
plt.xlabel(f'{entities[0].capitalize()} Activity')
plt.ylabel(f'{entities[1].capitalize()} Activity')
plt.title('Cross-Activity Scatter Plot')
# Add correlation coefficient
corr_coef = aligned_series[entities[0]].corr(aligned_series[entities[1]])
plt.text(0.05, 0.95, f'r = {corr_coef:.3f}',
transform=plt.gca().transAxes,
bbox=dict(boxstyle='round', facecolor='white', alpha=0.8))
# Summary statistics
plt.subplot(2, 3, 6)
summary_text = f"""
Temporal Correlation Analysis Summary:
Data Coverage:
• Period: {date_range.min().strftime('%Y-%m-%d')} to {date_range.max().strftime('%Y-%m-%d')}
• Duration: {len(date_range)} weeks
• Entities: {', '.join(aligned_series.keys())}
Key Correlations:
"""
# Add strongest correlations
for i in range(len(correlation_matrix)):
for j in range(i+1, len(correlation_matrix)):
corr_val = correlation_matrix.iloc[i, j]
entity1 = correlation_matrix.index[i]
entity2 = correlation_matrix.columns[j]
summary_text += f"" {entity1} vs {entity2}: {corr_val:.3f}\n"
# Add lag correlation insights
if lag_correlations:
# Use nanmax: shifted series can yield NaN correlations at extreme lags
max_lag_corr = max(np.nanmax(np.abs(np.array(corr))) for corr in lag_correlations.values())
summary_text += f"\nStrongest Lag Correlation: {max_lag_corr:.3f}"
plt.text(0.05, 0.95, summary_text, fontsize=9, verticalalignment='top',
bbox=dict(boxstyle='round', facecolor='lightyellow', alpha=0.8),
transform=plt.gca().transAxes)
plt.xlim(0, 1)
plt.ylim(0, 1)
plt.axis('off')
plt.title('Correlation Summary')
plt.tight_layout()
plt.show()
return correlation_df, correlation_matrix, lag_correlations
return None, None, None
# Analyze temporal correlations
activity_df, corr_matrix, lag_corr = analyze_temporal_correlations()
Best Practices and Implementation Guidelines¶
1. Performance Optimization¶
When conducting timeline analysis with large datasets:
# Efficient data retrieval strategies
def optimized_timeline_queries():
"""Best practices for efficient timeline data retrieval"""
# 1. Use selective field retrieval
efficient_params = {
'$select': 'id,opdateringsdato,periodeid', # Only needed fields
'$filter': "year(opdateringsdato) ge 2020", # Targeted time range
'$orderby': 'opdateringsdato asc', # Consistent ordering
'$top': 1000 # Manageable batch size
}
# 2. Implement pagination for large datasets
def paginated_timeline_data(entity, start_year=2020):
all_data = []
skip = 0
batch_size = 1000
while True:
params = {
'$select': 'id,opdateringsdato,periodeid',
'$filter': f"year(opdateringsdato) ge {start_year}",
'$orderby': 'opdateringsdato asc',
'$top': batch_size,
'$skip': skip
}
response = requests.get(f"https://oda.ft.dk/api/{entity}", params=params)
if response.status_code != 200:
break
data = response.json()
batch = data.get('value', [])
if not batch:
break
all_data.extend(batch)
skip += batch_size
# Pause briefly between batches to avoid overwhelming the API
time.sleep(0.1)
return all_data
# 3. Use caching for repeated analyses
# (time is imported here so it is available both to the pagination pause
# above and to the cache-age check below)
import hashlib
import os
import pickle
import time
def cached_timeline_query(entity, params, cache_dir="timeline_cache"):
"""Cache timeline queries to improve performance"""
if not os.path.exists(cache_dir):
os.makedirs(cache_dir)
# Create cache key from parameters
cache_key = hashlib.md5(f"{entity}_{str(sorted(params.items()))}".encode()).hexdigest()
cache_file = os.path.join(cache_dir, f"{cache_key}.pkl")
# Check if cached data exists and is recent (< 1 day old)
if os.path.exists(cache_file):
cache_age = time.time() - os.path.getmtime(cache_file)
if cache_age < 86400: # 1 day in seconds
with open(cache_file, 'rb') as f:
return pickle.load(f)
# Fetch fresh data
response = requests.get(f"https://oda.ft.dk/api/{entity}", params=params)
if response.status_code == 200:
data = response.json()
# Cache the results
with open(cache_file, 'wb') as f:
pickle.dump(data, f)
return data
return None
# Usage examples
cached_data = cached_timeline_query('Sag', efficient_params)
paginated_data = paginated_timeline_data('Aktør', 2018)
2. Error Handling and Data Quality¶
def robust_timeline_analysis():
"""Implement robust error handling for timeline analysis"""
def safe_date_parsing(date_string):
"""Safely parse date strings with error handling"""
if not date_string:
return None
try:
# Handle different date formats
if 'T' in date_string:
return datetime.fromisoformat(date_string.replace('Z', '+00:00'))
else:
return datetime.fromisoformat(date_string)
except (ValueError, TypeError) as e:
print(f"Date parsing error: {date_string} - {e}")
return None
def validate_timeline_data(df, required_columns=['date'], date_columns=['date']):
"""Validate timeline data quality"""
validation_report = {
'total_rows': len(df),
'missing_dates': 0,
'invalid_dates': 0,
'duplicate_entries': 0,
'date_range': None,
'data_gaps': 0  # count of gaps longer than 30 days
}
# Check for missing required columns
missing_cols = [col for col in required_columns if col not in df.columns]
if missing_cols:
print(f"Missing required columns: {missing_cols}")
return validation_report
# Validate date columns
for col in date_columns:
if col in df.columns:
validation_report['missing_dates'] += df[col].isna().sum()
# Check for valid date range
valid_dates = df[col].dropna()
if not valid_dates.empty:
validation_report['date_range'] = {
'start': valid_dates.min(),
'end': valid_dates.max(),
'span_days': (valid_dates.max() - valid_dates.min()).days
}
# Check for duplicates
validation_report['duplicate_entries'] = df.duplicated().sum()
# Identify data gaps (periods with no activity)
if 'date' in df.columns:
# Coerce to datetime64 first: object-dtype columns containing None
# would raise on .diff()
dates = pd.to_datetime(df['date'], utc=True, errors='coerce').dropna().sort_values()
date_diffs = dates.diff()
large_gaps = date_diffs[date_diffs > pd.Timedelta(days=30)]
validation_report['data_gaps'] = len(large_gaps)
return validation_report
def clean_timeline_data(df):
"""Clean and prepare timeline data"""
cleaned_df = df.copy()
# Remove duplicates
cleaned_df = cleaned_df.drop_duplicates()
# Handle missing dates
cleaned_df = cleaned_df.dropna(subset=['date'])
# Sort by date
cleaned_df = cleaned_df.sort_values('date')
# Reset index
cleaned_df = cleaned_df.reset_index(drop=True)
print(f"Data cleaning completed: {len(df)} {len(cleaned_df)} rows")
return cleaned_df
# Example usage
sample_data = pd.DataFrame({
'date': ['2023-01-01T00:00:00Z', None, '2023-01-03T00:00:00Z', '2023-01-01T00:00:00Z'],
'activity': ['meeting', 'vote', 'decision', 'meeting']
})
sample_data['date'] = sample_data['date'].apply(safe_date_parsing)
validation_report = validate_timeline_data(sample_data)
cleaned_data = clean_timeline_data(sample_data)
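The helpers above address data quality; transient network failures deserve the same care. The sketch below retries failed requests with exponential backoff and gives up immediately on client-side errors (fetch_with_retry is an illustrative name, not part of any library); it can wrap any of the query patterns in this guide:

import time
import requests

def fetch_with_retry(url, params=None, max_retries=3, backoff=1.0):
    """Fetch a URL, retrying transient failures with exponential backoff."""
    for attempt in range(max_retries):
        try:
            response = requests.get(url, params=params, timeout=30)
            if response.status_code == 200:
                return response.json()
            if response.status_code < 500:
                # Client-side error (bad filter, unknown entity): retrying won't help
                print(f"Non-retryable status {response.status_code} for {url}")
                return None
        except requests.RequestException as e:
            print(f"Request failed (attempt {attempt + 1}): {e}")
        time.sleep(backoff * (2 ** attempt))  # 1s, 2s, 4s, ...
    return None

# Example: recent case updates, fetched with retries
data = fetch_with_retry(
    "https://oda.ft.dk/api/Sag",
    params={'$top': 5, '$select': 'id,titel,opdateringsdato'}
)

Retrying only server-side (5xx) responses and connection errors avoids hammering the API when the query itself is malformed.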
Advanced Timeline Visualization¶
Create publication-ready visualizations for timeline analysis:
def create_publication_timeline_viz():
"""Create professional timeline visualizations"""
# Set publication-ready style
plt.style.use('seaborn-v0_8-whitegrid')
plt.rcParams['font.family'] = 'Arial'  # matplotlib falls back to its default if Arial is unavailable
plt.rcParams['font.size'] = 10
plt.rcParams['axes.labelsize'] = 12
plt.rcParams['axes.titlesize'] = 14
plt.rcParams['legend.fontsize'] = 10
# Create sample data for demonstration
dates = pd.date_range('2020-01-01', '2024-12-31', freq='W')
activities = np.random.poisson(5, len(dates)) + np.sin(np.arange(len(dates)) * 2 * np.pi / 52) * 2
timeline_df = pd.DataFrame({
'date': dates,
'activity': activities,
'trend': activities.cumsum()
})
# Create comprehensive timeline visualization
fig, axes = plt.subplots(2, 2, figsize=(16, 12))
fig.suptitle('Parliamentary Activity Timeline Analysis', fontsize=16, fontweight='bold')
# Main timeline with trend
ax1 = axes[0, 0]
ax1.plot(timeline_df['date'], timeline_df['activity'], alpha=0.7, color='steelblue', linewidth=1)
ax1_twin = ax1.twinx()
ax1_twin.plot(timeline_df['date'], timeline_df['trend'], color='darkred', linewidth=2, alpha=0.8)
ax1.set_title('Activity Timeline with Cumulative Trend')
ax1.set_ylabel('Weekly Activity', color='steelblue')
ax1_twin.set_ylabel('Cumulative Trend', color='darkred')
ax1.tick_params(axis='x', rotation=45)
# Seasonal decomposition
ax2 = axes[0, 1]
monthly_avg = timeline_df.set_index('date').resample('M')['activity'].mean()
seasonal_pattern = monthly_avg.groupby(monthly_avg.index.month).mean()
ax2.bar(range(1, 13), seasonal_pattern.values, color='lightcoral', alpha=0.8)
ax2.set_title('Seasonal Activity Pattern')
ax2.set_xlabel('Month')
ax2.set_ylabel('Average Activity')
ax2.set_xticks(range(1, 13))
ax2.set_xticklabels(['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'])
# Activity distribution
ax3 = axes[1, 0]
ax3.hist(timeline_df['activity'], bins=20, alpha=0.7, color='mediumseagreen', edgecolor='black')
ax3.axvline(timeline_df['activity'].mean(), color='red', linestyle='--',
label=f'Mean: {timeline_df["activity"].mean():.1f}')
ax3.set_title('Activity Distribution')
ax3.set_xlabel('Weekly Activity Count')
ax3.set_ylabel('Frequency')
ax3.legend()
# Annual comparison
ax4 = axes[1, 1]
annual_activity = timeline_df.set_index('date').resample('Y')['activity'].sum()
bars = ax4.bar(range(len(annual_activity)), annual_activity.values,
color='gold', alpha=0.8, edgecolor='black')
ax4.set_title('Annual Activity Comparison')
ax4.set_xlabel('Year')
ax4.set_ylabel('Total Activity')
ax4.set_xticks(range(len(annual_activity)))
ax4.set_xticklabels([str(year) for year in annual_activity.index.year])
# Add value labels on bars
for bar, value in zip(bars, annual_activity.values):
ax4.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.5,
f'{value:.0f}', ha='center', va='bottom', fontsize=9)
plt.tight_layout()
plt.show()
return timeline_df
# Create publication-quality visualization
demo_timeline = create_publication_timeline_viz()
Conclusion¶
Timeline and temporal analysis of Danish Parliamentary data offers unprecedented insights into the dynamics of democratic governance. The comprehensive coverage spanning 74+ years, combined with real-time updates and detailed entity relationships, enables sophisticated chronological analyses that can inform academic research, journalistic investigation, and civic engagement.
Key capabilities demonstrated include:
- Multi-dimensional temporal analysis across cases, actors, meetings, votes, and documents
- Pattern detection for seasonal rhythms, career trajectories, and political evolution
- Anomaly identification for significant events and behavioral changes
- Predictive modeling for future parliamentary activity and legislative throughput
- Cross-entity correlation analysis revealing interconnected parliamentary functions
The temporal richness of this dataset, combined with proper analytical techniques, transforms raw parliamentary data into actionable insights about democratic processes, political trends, and institutional behavior over time.
Always use UTF-8 encoding when working with Danish text, implement robust error handling for API requests, and cache results for large-scale temporal analyses. When constructing URLs by hand, the API's requirement that dollar signs in OData parameters be encoded as %24 remains critical for successful queries, as the sketch below illustrates.
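A minimal illustration of both styles, assuming the same Sag query used throughout this guide; the hand-built URL writes %24 explicitly, while the requests params argument performs the same encoding automatically:

import requests

# Hand-built query string: each '$' is written as '%24'
manual = requests.get(
    "https://oda.ft.dk/api/Sag?%24top=3&%24select=id,titel,opdateringsdato"
)

# Equivalent query via params: requests percent-encodes '$' for you
auto = requests.get(
    "https://oda.ft.dk/api/Sag",
    params={'$top': 3, '$select': 'id,titel,opdateringsdato'}
)

print(manual.status_code, auto.status_code)  # both should be 200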
This temporal analysis framework provides the foundation for understanding how Danish democracy has evolved and continues to function in real-time, offering valuable perspectives for researchers, journalists, and citizens engaged in democratic oversight and participation.