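Schedule Forecasting: How to Accurately Predict the Best Time Windows and Resource Allocation for Marketing Campaigns

Introduction: Why Schedule Forecasting Matters for Marketing Campaigns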

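In today's highly competitive digital marketing environment, accurately predicting the best time window and resource allocation for a campaign has become a key factor in business success. Schedule forecasting is more than picking a date: it is a complex decision process that weighs market trends, user behavior, the competitive landscape, and resource constraints.

Imagine an e-commerce company planning a promotion for "Double 11" (November 11) that faces several challenges. How should it determine the best warm-up and peak periods? How should it split the budget across channels such as social media, search engines, and email? How can it avoid clashing with major competitors' promotions? The answers directly affect the campaign's ROI and overall marketing results.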

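The core value of schedule forecasting lies in:

Maximizing return on investment: choosing the time windows when users are most active raises conversion rates
Optimizing resource use: avoiding spend in inefficient periods keeps budget and staffing well allocated
Reducing competitive conflict: steering clear of competitors' peak periods, or gaining an edge when competition is weaker
Improving user experience: reaching users at the moments they are most receptive to marketing messages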

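This article systematically describes how to build a data-driven schedule forecasting system for marketing campaigns, covering data collection, model building, time window selection, and resource allocation strategy, with complete code examples and a worked case study.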

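1. Data Foundation: Building the Data System for Schedule Forecasting

1.1 Core Data Dimensions

Accurate schedule forecasting starts with a comprehensive data collection system. The following core data dimensions should be collected: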

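User behavior data:

Historical conversion data: click-through rate, conversion rate, purchase rate, and so on
User activity time distribution: daily, weekly, and monthly peak activity periods
User lifecycle stage: differences in response timing among new, active, and dormant users
User preference data: when different user segments respond to different marketing channels

Market environment data:

Industry seasonality: holidays, promotion seasons, peak and off-peak periods
Competitor activity: timing, frequency, and intensity of competitors' promotions
Macroeconomic factors: economic cycles, policy changes, and so on
Social media trends: topic popularity and public sentiment

Resource constraint data:

Budget constraints: total budget and per-channel budget allocation
Staffing constraints: capacity of the operations and customer service teams
Logistics constraints: inventory levels and delivery capacity
Technical constraints: system capacity and server resources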

1.2 Data Collection and Preprocessing Example

The following Python example shows how to collect and preprocess marketing data:

import pandas as pd
import numpy as np
from datetime import datetime, timedelta
import matplotlib.pyplot as plt
import seaborn as sns

class MarketingDataCollector:
    def __init__(self):
        self.data = pd.DataFrame()
        
    def generate_synthetic_data(self, days=365):
        """
        Generate synthetic marketing data for demonstrating schedule forecasting.
        """
        np.random.seed(42)
        
        # Generate the date sequence
        dates = pd.date_range(start='2023-01-01', periods=days, freq='D')
        
        # Baseline traffic (with seasonality and trend)
        base_traffic = 1000 + np.sin(np.arange(days) * 2 * np.pi / 365) * 200
        trend = np.arange(days) * 0.5
        weekly_pattern = np.where(dates.dayofweek < 5, 100, -50)  # weekdays get more traffic
        
        # Add random noise
        noise = np.random.normal(0, 50, days)
        
        # Total traffic
        traffic = base_traffic + trend + weekly_pattern + noise
        
        # Conversion rate (negatively correlated with traffic, simulating a competition effect)
        base_conversion = 0.03
        conversion_rate = base_conversion - (traffic - 1000) * 0.00001 + np.random.normal(0, 0.002, days)
        conversion_rate = np.clip(conversion_rate, 0.01, 0.05)
        
        # Conversions
        conversions = traffic * conversion_rate
        
        # Competition intensity (simulated competitor activity)
        competitor_activity = np.random.poisson(0.3, days)
        
        # Holiday and shopping-festival flags
        holidays = [
            '2023-01-01', '2023-01-22', '2023-05-01', '2023-06-18',
            '2023-09-10', '2023-10-01', '2023-11-11', '2023-12-12'
        ]
        is_holiday = dates.isin(pd.to_datetime(holidays)).astype(int)
        
        # Build the DataFrame
        df = pd.DataFrame({
            'date': dates,
            'traffic': traffic.astype(int),
            'conversion_rate': conversion_rate,
            'conversions': conversions.astype(int),
            'competitor_activity': competitor_activity,
            'is_holiday': is_holiday,
            'day_of_week': dates.dayofweek,
            'month': dates.month,
            'is_weekend': (dates.dayofweek >= 5).astype(int)
        })
        
        # Add user segmentation data
        df['new_user_ratio'] = np.random.beta(2, 5, days)  # share of new users
        df['active_user_ratio'] = np.random.beta(5, 2, days)  # share of active users
        
        return df
    
    def calculate_time_windows(self, df, window_size=7):
        """
        Compute rolling-window statistics.
        """
        df_sorted = df.sort_values('date')
        
        # Rolling statistics
        df_sorted['rolling_traffic_mean'] = df_sorted['traffic'].rolling(window=window_size).mean()
        df_sorted['rolling_conversion_mean'] = df_sorted['conversion_rate'].rolling(window=window_size).mean()
        df_sorted['rolling_competitor_mean'] = df_sorted['competitor_activity'].rolling(window=window_size).mean()
        
        # Growth rates
        df_sorted['traffic_growth_rate'] = df_sorted['traffic'].pct_change()
        df_sorted['conversion_growth_rate'] = df_sorted['conversion_rate'].pct_change()
        
        return df_sorted

# Usage example
collector = MarketingDataCollector()
marketing_data = collector.generate_synthetic_data(365)
processed_data = collector.calculate_time_windows(marketing_data)

print("Data preview:")
print(processed_data.head())
print(f"\nDataset shape: {processed_data.shape}")
print("\nSummary statistics:")
print(processed_data[['traffic', 'conversion_rate', 'conversions']].describe())

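This code generates a synthetic dataset covering 365 days of marketing data, including traffic, conversion rate, and competition intensity, and computes rolling-window statistics that the later forecasting steps build on.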
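
To make the sliding-window step concrete, here is a minimal sketch of a trailing mean in plain Python. It mirrors what `rolling(window=window_size).mean()` does in the code above, including the undefined values before the window fills up. The helper name `trailing_mean` and the sample traffic figures are illustrative assumptions, not part of the original pipeline.

```python
from collections import deque

def trailing_mean(values, window_size):
    """Return the trailing mean at each position; None until the window is full."""
    window = deque(maxlen=window_size)  # keeps only the most recent values
    result = []
    for v in values:
        window.append(v)
        if len(window) < window_size:
            result.append(None)  # not enough history yet
        else:
            result.append(sum(window) / window_size)
    return result

daily_traffic = [1000, 1100, 900, 1200, 1300]  # made-up numbers
smoothed = trailing_mean(daily_traffic, window_size=3)
print(smoothed)
```

The leading `None` entries are why the first `window_size - 1` rows of the rolling columns are missing and need to be filled or dropped before training.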

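2. Time Window Prediction Model

2.1 Prediction Model Architecture

The core of time window prediction is identifying the best moment to run a campaign. We need a multi-dimensional prediction model that weighs the following factors:

User response probability: predict from historical data how likely users are to respond at a given time
Competition intensity: assess how active competitors are in the same period
Resource availability: make sure enough resources are available during the selected period
Cost-effectiveness: balance the cost invested against the expected return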
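
One simple way to combine these four factors is a weighted score over normalized inputs. The sketch below is only an illustration under stated assumptions: the function names, the sample values, and the weights `(0.4, 0.3, 0.2, 0.1)` are hypothetical and would need to be tuned against real campaign data.

```python
def min_max(values):
    """Scale a list of numbers to [0, 1]; constant input maps to 0.5."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def composite_scores(response, competition, resources, cost_benefit,
                     weights=(0.4, 0.3, 0.2, 0.1)):
    """Weighted score per candidate day; higher is better."""
    r = min_max(response)
    c = [1 - x for x in min_max(competition)]  # less competition is better
    a = min_max(resources)
    b = min_max(cost_benefit)
    w_r, w_c, w_a, w_b = weights
    return [w_r * ri + w_c * ci + w_a * ai + w_b * bi
            for ri, ci, ai, bi in zip(r, c, a, b)]

# Three candidate days with made-up inputs
scores = composite_scores(
    response=[0.02, 0.05, 0.03],     # predicted response probability
    competition=[3, 1, 0],           # competitor campaigns that day
    resources=[0.8, 0.6, 1.0],       # share of team capacity available
    cost_benefit=[1.5, 2.5, 2.0],    # expected return per unit of cost
)
best_day = scores.index(max(scores))
print(scores, best_day)
```

Normalizing first keeps a factor measured in large units from dominating the others purely because of its scale.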

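2.2 Machine Learning Prediction Model Implementation

Below is a complete implementation that compares random forest, XGBoost, and gradient boosting models for time window prediction: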

from sklearn.model_selection import train_test_split, TimeSeriesSplit
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from sklearn.preprocessing import StandardScaler
import xgboost as xgb
import warnings
warnings.filterwarnings('ignore')

class TimeWindowPredictor:
    def __init__(self):
        self.models = {}
        self.scaler = StandardScaler()
        self.feature_importance = {}
        
    def prepare_features(self, df):
        """
        Prepare the prediction features.
        """
        features = df.copy()
        
        # Cyclical time features
        features['day_of_week_sin'] = np.sin(2 * np.pi * features['day_of_week'] / 7)
        features['day_of_week_cos'] = np.cos(2 * np.pi * features['day_of_week'] / 7)
        features['month_sin'] = np.sin(2 * np.pi * features['month'] / 12)
        features['month_cos'] = np.cos(2 * np.pi * features['month'] / 12)
        
        # Lag features
        for lag in [1, 7, 14]:
            features[f'traffic_lag_{lag}'] = features['traffic'].shift(lag)
            features[f'conversion_lag_{lag}'] = features['conversion_rate'].shift(lag)
        
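        # Rolling features (shifted by one day so the current day's traffic,
        # which drives the target, does not leak into its own features)
        features['traffic_rolling_7'] = features['traffic'].shift(1).rolling(7).mean()
        features['traffic_rolling_30'] = features['traffic'].shift(1).rolling(30).mean()
        
        # Fill missing values (fillna(method=...) is deprecated in recent pandas)
        features = features.bfill().ffill()
        
        # Define the feature columns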
        feature_cols = [
            'day_of_week_sin', 'day_of_week_cos', 'month_sin', 'month_cos',
            'is_holiday', 'is_weekend', 'competitor_activity',
            'traffic_lag_1', 'traffic_lag_7', 'traffic_lag_14',
            'conversion_lag_1', 'conversion_lag_7', 'conversion_lag_14',
            'traffic_rolling_7', 'traffic_rolling_30',
            'new_user_ratio', 'active_user_ratio'
        ]
        
        return features[feature_cols], features['conversions']
    
    def train_models(self, X, y):
        """
        Train several candidate models.
        """
        # Time-series split (preserves temporal order)
        tscv = TimeSeriesSplit(n_splits=5)
        
        # Prepare the data
        X_scaled = self.scaler.fit_transform(X)
        
        # Model 1: random forest
        rf_model = RandomForestRegressor(
            n_estimators=200,
            max_depth=10,
            min_samples_split=5,
            random_state=42,
            n_jobs=-1
        )
        
        # Model 2: XGBoost
        xgb_model = xgb.XGBRegressor(
            n_estimators=200,
            max_depth=6,
            learning_rate=0.1,
            subsample=0.8,
            colsample_bytree=0.8,
            random_state=42,
            n_jobs=-1
        )
        
        # Model 3: gradient boosting
        gb_model = GradientBoostingRegressor(
            n_estimators=200,
            max_depth=6,
            learning_rate=0.1,
            random_state=42
        )
        
        models = {
            'random_forest': rf_model,
            'xgboost': xgb_model,
            'gradient_boosting': gb_model
        }
        
        # Cross-validated training
        cv_scores = {}
        for name, model in models.items():
            scores = []
            for train_idx, val_idx in tscv.split(X_scaled):
                X_train, X_val = X_scaled[train_idx], X_scaled[val_idx]
                y_train, y_val = y.iloc[train_idx], y.iloc[val_idx]
                
                model.fit(X_train, y_train)
                y_pred = model.predict(X_val)
                # RMSE; the squared=False argument was removed in recent scikit-learn
                score = np.sqrt(mean_squared_error(y_val, y_pred))
                scores.append(score)
            
            cv_scores[name] = np.mean(scores)
            print(f"{name} CV RMSE: {np.mean(scores):.2f}")
        
        # Select the best model
        best_model_name = min(cv_scores, key=cv_scores.get)
        self.models['best'] = models[best_model_name]
        
        # Retrain the best model on the full dataset
        self.models['best'].fit(X_scaled, y)
        
        # Compute feature importances
        if hasattr(self.models['best'], 'feature_importances_'):
            self.feature_importance = dict(zip(X.columns, self.models['best'].feature_importances_))
        
        return self.models['best']
    
    def predict_time_windows(self, future_dates, df_history):
        """
        Predict performance for future time windows.
        """
        # Build the features for the future dates
        future_data = []
        for date in future_dates:
            future_row = {
                'date': date,
                'day_of_week': date.dayofweek,
                'month': date.month,
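                'is_holiday': 0,  # adjust according to the actual holiday calendar
                'is_weekend': 1 if date.dayofweek >= 5 else 0,
                'competitor_activity': 0,  # adjust according to competitor intelligence
                'new_user_ratio': df_history['new_user_ratio'].mean(),
                'active_user_ratio': df_history['active_user_ratio'].mean()
            }
            
            # Add lag features (taken from the end of the history;
            # every future date reuses the same values)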
            if len(df_history) >= 14:
                future_row['traffic_lag_1'] = df_history['traffic'].iloc[-1]
                future_row['traffic_lag_7'] = df_history['traffic'].iloc[-7]
                future_row['traffic_lag_14'] = df_history['traffic'].iloc[-14]
                future_row['conversion_lag_1'] = df_history['conversion_rate'].iloc[-1]
                future_row['conversion_lag_7'] = df_history['conversion_rate'].iloc[-7]
                future_row['conversion_lag_14'] = df_history['conversion_rate'].iloc[-14]
                future_row['traffic_rolling_7'] = df_history['traffic'].iloc[-7:].mean()
                future_row['traffic_rolling_30'] = df_history['traffic'].iloc[-30:].mean()
            else:
                # Fall back to historical means
                future_row['traffic_lag_1'] = df_history['traffic'].mean()
                future_row['traffic_lag_7'] = df_history['traffic'].mean()
                future_row['traffic_lag_14'] = df_history['traffic'].mean()
                future_row['conversion_lag_1'] = df_history['conversion_rate'].mean()
                future_row['conversion_lag_7'] = df_history['conversion_rate'].mean()
                future_row['conversion_lag_14'] = df_history['conversion_rate'].mean()
                future_row['traffic_rolling_7'] = df_history['traffic'].mean()
                future_row['traffic_rolling_30'] = df_history['traffic'].mean()
            
            future_data.append(future_row)
        
        future_df = pd.DataFrame(future_data)
        
        # Add cyclical time features
        future_df['day_of_week_sin'] = np.sin(2 * np.pi * future_df['day_of_week'] / 7)
        future_df['day_of_week_cos'] = np.cos(2 * np.pi * future_df['day_of_week'] / 7)
        future_df['month_sin'] = np.sin(2 * np.pi * future_df['month'] / 12)
        future_df['month_cos'] = np.cos(2 * np.pi * future_df['month'] / 12)
        
        feature_cols_transformed = [
            'day_of_week_sin', 'day_of_week_cos', 'month_sin', 'month_cos',
            'is_holiday', 'is_weekend', 'competitor_activity',
            'traffic_lag_1', 'traffic_lag_7', 'traffic_lag_14',
            'conversion_lag_1', 'conversion_lag_7', 'conversion_lag_14',
            'traffic_rolling_7', 'traffic_rolling_30',
            'new_user_ratio', 'active_user_ratio'
        ]
        
        X_future = future_df[feature_cols_transformed]
        X_future_scaled = self.scaler.transform(X_future)
        
        # Predict
        predictions = self.models['best'].predict(X_future_scaled)
        future_df['prediction'] = predictions
        
        # Approximate prediction interval from historical residuals.
        # Features are rebuilt from df_history, so no global X / y is needed.
        # These are in-sample residuals, so the interval is likely too narrow.
        X_hist, y_hist = self.prepare_features(df_history)
        history_pred = self.models['best'].predict(self.scaler.transform(X_hist))
        residual_std = (y_hist - history_pred).std()
        
        future_df['lower_bound'] = predictions - 1.96 * residual_std
        future_df['upper_bound'] = predictions + 1.96 * residual_std
        # Heuristic confidence in [0, 1]; guards against non-positive predictions
        safe_pred = np.maximum(predictions, 1e-9)
        future_df['confidence'] = np.clip(1 - 1.96 * residual_std / safe_pred, 0, 1)
        
        return future_df

# Usage example
predictor = TimeWindowPredictor()
X, y = predictor.prepare_features(processed_data)
best_model = predictor.train_models(X, y)

# Predict the next 30 days
future_dates = pd.date_range(start='2024-01-01', periods=30, freq='D')
future_predictions = predictor.predict_time_windows(future_dates, processed_data)

print("\nPredictions for the next 30 days:")
print(future_predictions[['date', 'prediction', 'lower_bound', 'upper_bound', 'confidence']].head(10))

# Feature importance analysis
print("\nFeature importance:")
for feature, importance in sorted(predictor.feature_importance.items(), key=lambda x: x[1], reverse=True)[:5]:
    print(f"  {feature}: {importance:.4f}")
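
The sin/cos features in `prepare_features` exist because a raw day-of-week number treats Sunday (6) and Monday (0) as far apart, even though they are adjacent in the weekly cycle. The short sketch below checks this numerically; the helper names are hypothetical and only the standard library is used.

```python
import math

def encode_day_of_week(day):
    """Map a day index (0 = Monday ... 6 = Sunday) onto the unit circle."""
    angle = 2 * math.pi * day / 7
    return (math.sin(angle), math.cos(angle))

def encoded_distance(day_a, day_b):
    """Euclidean distance between two days in the sin/cos space."""
    return math.dist(encode_day_of_week(day_a), encode_day_of_week(day_b))

sun_mon = encoded_distance(6, 0)   # neighbours across the week boundary
mon_tue = encoded_distance(0, 1)   # neighbours inside the week
raw_gap = abs(6 - 0)               # what the raw integer would suggest

print(f"Sun-Mon: {sun_mon:.4f}, Mon-Tue: {mon_tue:.4f}, raw gap: {raw_gap}")
```

In the encoded space both pairs of neighbouring days are the same distance apart. Tree-based models can split on raw integers as well, so the benefit is probably larger for linear or distance-based models.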

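2.3 Time Window Optimization Algorithm

With the predictions in hand, the next step is to find the best time windows. The following is a greedy selection algorithm: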

class TimeWindowOptimizer:
    def __init__(self, budget=100000, max_concurrent_campaigns=3):
        self.budget = budget
        self.max_concurrent_campaigns = max_concurrent_campaigns
        
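    def find_optimal_windows(self, predictions_df, campaign_duration=7, min_gap=2):
        """
        Find the best non-overlapping time windows.
        """
        df = predictions_df.copy()  # avoid mutating the caller's DataFrame
        
        # Scale predicted conversions to [0, 1] so the three weights are comparable
        pred_range = df['prediction'].max() - df['prediction'].min()
        pred_norm = (df['prediction'] - df['prediction'].min()) / pred_range if pred_range > 0 else 0.5
        
        # Relative competitor activity; 0 when no activity is recorded at all
        # (this also avoids dividing by zero)
        comp_max = df['competitor_activity'].max()
        comp_norm = df['competitor_activity'] / comp_max if comp_max > 0 else 0.0
        
        # Score each date
        df['score'] = (
            pred_norm * 0.4 +          # predicted conversions weight
            df['confidence'] * 0.3 +   # confidence weight
            (1 - comp_norm) * 0.3      # low-competition weight
        )
        
        # Rank candidate start dates by score
        ranked_days = df.sort_values('score', ascending=False).reset_index(drop=True)
        
        # Pick non-overlapping windows
        selected_windows = []
        used_dates = set()
        
        for _, row in ranked_days.iterrows():
            start_date = row['date']
            window_dates = pd.date_range(start=start_date, periods=campaign_duration, freq='D')
            
            # Reject windows that overlap, or sit closer than min_gap days to,
            # any already selected window (earlier or later)
            too_close = any(
                abs((d - u).days) < max(min_gap, 1)
                for d in window_dates for u in used_dates
            )
            if too_close:
                continue
            
            selected_windows.append({
                'start_date': start_date,
                'end_date': window_dates[-1],
                'score': row['score'],
                # Approximation: the start day's prediction stands in for the whole window
                'estimated_conversions': row['prediction'] * campaign_duration,
                'confidence': row['confidence']
            })
            used_dates.update(window_dates)
            
            if len(selected_windows) >= self.max_concurrent_campaigns:
                break
        
        return selected_windows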

# Usage example
optimizer = TimeWindowOptimizer(budget=50000, max_concurrent_campaigns=2)
optimal_windows = optimizer.find_optimal_windows(future_predictions, campaign_duration=7, min_gap=3)

print("\nRecommended time windows:")
for i, window in enumerate(optimal_windows, 1):
    print(f"Window {i}:")
    print(f"  Dates: {window['start_date'].strftime('%Y-%m-%d')} to {window['end_date'].strftime('%Y-%m-%d')}")
    print(f"  Estimated conversions: {window['estimated_conversions']:.0f}")
    print(f"  Confidence: {window['confidence']:.2%}")
    print(f"  Composite score: {window['score']:.2f}")
    print()
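
The optimizer above ranks windows by the score of their start date only. A possible variation, sketched below under stated assumptions, ranks each candidate window by the sum of daily predictions over its whole duration and then greedily keeps non-overlapping windows. The function name `best_windows`, the meaning of `min_gap` (idle days required between two windows), and the sample numbers are all illustrative, not taken from the original code.

```python
def best_windows(daily_predictions, duration, min_gap, max_windows):
    """Greedily pick non-overlapping windows ranked by summed predictions.

    Returns (start_index, end_index, total) tuples sorted by start index.
    """
    n = len(daily_predictions)
    candidates = []
    for start in range(n - duration + 1):
        total = sum(daily_predictions[start:start + duration])
        candidates.append((total, start))
    # Highest total first; an earlier start breaks ties
    candidates.sort(key=lambda c: (-c[0], c[1]))

    chosen = []
    for total, start in candidates:
        end = start + duration - 1
        # Keep at least min_gap idle days between any two windows
        fits = all(start > e + min_gap or end < s - min_gap for s, e, _ in chosen)
        if fits:
            chosen.append((start, end, total))
            if len(chosen) == max_windows:
                break
    return sorted(chosen)

# Twelve days of made-up predicted conversions
predictions = [10, 12, 30, 35, 33, 11, 9, 8, 25, 28, 27, 10]
windows = best_windows(predictions, duration=3, min_gap=2, max_windows=2)
print(windows)
```

Greedy selection is simple and fast, but it is not guaranteed to find the combination of windows with the highest total; for a handful of windows an exhaustive search or dynamic programming would be an alternative.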

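3. Resource Allocation Strategy

3.1 Resource Allocation Model

The core of resource allocation is maximizing overall marketing results under a limited budget. We need to consider:

Channel allocation: ROI differs across channels
Time allocation: how intensively to invest in different periods
User segmentation: how to tilt resources toward different user groups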
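
One way to write the channel part of this problem down is as a small mixed-integer linear program. The symbols are introduced here for illustration only: $b_c$ is the budget for channel $c$, $r_c$ its historical ROI, $B$ the total budget, $m$ and $M$ the per-channel minimum and maximum, and $z_c \in \{0, 1\}$ indicates whether channel $c$ is used at all. The linear objective is a simplifying assumption that ignores diminishing returns.

```latex
\begin{aligned}
\max_{b,\,z} \quad & \sum_{c \in C} r_c \, b_c \\
\text{s.t.} \quad  & \sum_{c \in C} b_c \le B, \\
                   & m \, z_c \le b_c \le M \, z_c \quad \forall c \in C, \\
                   & \sum_{c \in C} z_c \ge 2, \\
                   & z_c \in \{0, 1\} \quad \forall c \in C.
\end{aligned}
```

The second constraint makes each budget semi-continuous: a channel either receives nothing ($z_c = 0$) or an amount between $m$ and $M$ ($z_c = 1$).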

3.2 Optimization Algorithm Implementation

from scipy.optimize import minimize
import pulp  # linear programming library

class ResourceAllocator:
    def __init__(self, channels, total_budget):
        self.channels = channels
        self.total_budget = total_budget
        
    def calculate_channel_roi(self, historical_data):
        """
        Compute the historical ROI of each channel.
        """
        roi_data = {}
        for channel in self.channels:
            channel_data = historical_data[historical_data['channel'] == channel]
            if len(channel_data) > 0:
                cost = channel_data['cost'].sum()
                revenue = channel_data['revenue'].sum()
                roi = (revenue - cost) / cost if cost > 0 else 0
                roi_data[channel] = {
                    'roi': roi,
                    'cost': cost,
                    'revenue': revenue,
                    'conversion_rate': channel_data['conversions'].sum() / channel_data['impressions'].sum()
                }
        return roi_data
    
    def optimize_budget_allocation(self, roi_data, constraints=None):
        """
        Optimize the budget allocation with linear programming.
        """
        if constraints is None:
            constraints = {
                'min_budget_per_channel': 1000,
                'max_budget_per_channel': self.total_budget * 0.5,
                'max_channels': len(self.channels)
            }
        
        # Create the optimization problem
        prob = pulp.LpProblem("Budget_Allocation", pulp.LpMaximize)
        
        # Decision variables: budget per channel.
        # The lower bound is 0 so that an unselected channel can receive nothing;
        # the per-channel minimum is enforced through the selection constraints below.
        budget_vars = pulp.LpVariable.dicts(
            "budget", 
            self.channels, 
            lowBound=0,
            upBound=constraints['max_budget_per_channel'],
            cat='Continuous'
        )
        
        # Objective: maximize total expected return
        # Assumes return = budget * ROI (a simplified linear model)
        prob += pulp.lpSum([budget_vars[ch] * roi_data[ch]['roi'] for ch in self.channels])
        
        # Constraint: total spend must not exceed the budget
        prob += pulp.lpSum([budget_vars[ch] for ch in self.channels]) <= self.total_budget
        
        # Binary variables indicating whether each channel is selected
        channel_selected = pulp.LpVariable.dicts("selected", self.channels, cat='Binary')
        
        for ch in self.channels:
            # A selected channel receives at least the minimum budget
            prob += budget_vars[ch] >= constraints['min_budget_per_channel'] * channel_selected[ch]
            # An unselected channel receives no budget
            prob += budget_vars[ch] <= constraints['max_budget_per_channel'] * channel_selected[ch]
        
        # Select at least 2 channels
        prob += pulp.lpSum([channel_selected[ch] for ch in self.channels]) >= 2
        
        # Solve
        prob.solve(pulp.PULP_CBC_CMD(msg=False))
        
        # Extract the results
        allocation = {}
        for ch in self.channels:
            allocation[ch] = budget_vars[ch].value()
        
        return allocation
    
    def allocate_time_based_budget(self, windows, channel_allocation):
        """
        Allocate budget across time windows.
        """
        total_windows = len(windows)
        if total_windows == 0:
            return {}
        
        # Weight each window by its predicted conversions
        total_conversions = sum(w['estimated_conversions'] for w in windows)
        
        time_allocation = {}
        for window in windows:
            weight = window['estimated_conversions'] / total_conversions
            window_budget = self.total_budget * weight
            
            # Split across channels within the window
            window_channels = {}
            for channel, budget in channel_allocation.items():
                # Adjust by channel characteristics,
                # e.g. social media performs better on weekends, email on weekdays
                channel_weight = self._get_time_channel_weight(channel, window['start_date'])
                window_channels[channel] = budget * weight * channel_weight
            
            # Normalize to the window budget
            total_window_channels = sum(window_channels.values())
            if total_window_channels > 0:
                scale = window_budget / total_window_channels
                window_channels = {ch: budget * scale for ch, budget in window_channels.items()}
            
            time_allocation[f"{window['start_date'].strftime('%Y-%m-%d')}_{window['end_date'].strftime('%Y-%m-%d')}"] = window_channels
        
        return time_allocation
    
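    def _get_time_channel_weight(self, channel, date):
        """
        Compute a weight from the day of week and the channel's characteristics.
        """
        weights = {
            'social': {5: 1.3, 6: 1.3},  # social media weighted higher on weekends
            'email': {0: 1.2, 1: 1.2, 2: 1.2, 3: 1.2, 4: 1.2},  # email weighted higher on weekdays
            'search': {0: 1.1, 1: 1.1, 2: 1.1, 3: 1.1, 4: 1.1},  # search weighted higher on weekdays
            'display': {5: 1.1, 6: 1.1}  # display ads weighted slightly higher on weekends
        }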
        
        channel_weights = weights.get(channel, {})
        day_of_week = date.dayofweek
        
        return channel_weights.get(day_of_week, 1.0)

# Usage example
channels = ['social', 'email', 'search', 'display']
total_budget = 50000

allocator = ResourceAllocator(channels, total_budget)

# Simulated historical data
np.random.seed(42)
historical_data = pd.DataFrame({
    'channel': np.random.choice(channels, 1000),
    'cost': np.random.uniform(100, 5000, 1000),
    'revenue': np.random.uniform(200, 8000, 1000),
    'conversions': np.random.randint(1, 50, 1000),
    'impressions': np.random.randint(1000, 10000, 1000)
})

# Compute ROI
roi_data = allocator.calculate_channel_roi(historical_data)
print("ROI by channel:")
for channel, data in roi_data.items():
    print(f"  {channel}: ROI={data['roi']:.2f}, conversion rate={data['conversion_rate']:.4f}")

# Optimize the budget allocation
optimal_allocation = allocator.optimize_budget_allocation(roi_data)
print("\nOptimal budget allocation:")
for channel, budget in optimal_allocation.items():
    print(f"  {channel}: ${budget:,.0f} ({budget/total_budget:.1%})")

# Budget allocation by time window
if optimal_windows:
    time_based_allocation = allocator.allocate_time_based_budget(optimal_windows, optimal_allocation)
    print("\nBudget allocation by time window:")
    for window, channels_budget in time_based_allocation.items():
        print(f"  Window {window}:")
        for channel, budget in channels_budget.items():
            print(f"    {channel}: ${budget:,.0f}")
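
When the objective is linear in the budget and the only limits are a total budget and a per-channel cap, the optimum can also be found without a solver by filling channels in descending ROI order (the fractional knapsack argument). The sketch below is a simplified cross-check under that assumption: it ignores the minimum-budget and minimum-channel-count constraints, and the function name and ROI figures are hypothetical.

```python
def greedy_allocation(roi_by_channel, total_budget, max_per_channel):
    """Fill channels in descending ROI order until the budget runs out."""
    remaining = total_budget
    allocation = {ch: 0.0 for ch in roi_by_channel}
    ranked = sorted(roi_by_channel.items(), key=lambda kv: kv[1], reverse=True)
    for channel, roi in ranked:
        if roi <= 0 or remaining <= 0:
            break  # never spend on channels that lose money
        spend = min(max_per_channel, remaining)
        allocation[channel] = spend
        remaining -= spend
    return allocation

roi = {'social': 0.8, 'email': 1.5, 'search': 1.1, 'display': 0.3}  # made-up ROI values
plan = greedy_allocation(roi, total_budget=50000, max_per_channel=20000)
print(plan)
```

The result is always a corner solution: the best channels are filled to their cap and the rest get little or nothing. If that looks too aggressive in practice, it is usually a sign that the linear ROI assumption should be replaced by a response curve with diminishing returns.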

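4. Case Study: Schedule Forecasting for an E-commerce Sales Event

4.1 Background

Suppose we run an e-commerce platform and need to set the schedule and resource allocation for the "Double 11" sales event. The campaign runs from November 1 to November 11 with a total budget of 1 million yuan, to be split across four channels: social media, search engines, email, and display advertising.

4.2 Complete Solution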

class Double11CampaignPlanner:
    def __init__(self, total_budget=1000000):
        self.total_budget = total_budget
        self.data_collector = MarketingDataCollector()
        self.predictor = TimeWindowPredictor()
        self.optimizer = TimeWindowOptimizer(budget=total_budget)
        self.allocator = ResourceAllocator(
            channels=['social', 'email', 'search', 'display'],
            total_budget=total_budget
        )
        
    def plan_campaign(self, historical_days=365, forecast_days=30):
        """
        End-to-end campaign schedule planning.
        """
        print("=" * 60)
        print("Double 11 Sales Event: Schedule Forecasting and Resource Allocation")
        print("=" * 60)
        
        # 1. Data preparation
        print("\n[Step 1] Data preparation...")
        historical_data = self.data_collector.generate_synthetic_data(historical_days)
        processed_data = self.data_collector.calculate_time_windows(historical_data)
        
        # 2. Feature engineering
        print("\n[Step 2] Feature engineering...")
        X, y = self.predictor.prepare_features(processed_data)
        print(f"  Number of features: {X.shape[1]}")
        print(f"  Number of samples: {len(X)}")
        
        # 3. Model training
        print("\n[Step 3] Model training...")
        best_model = self.predictor.train_models(X, y)
        
        # 4. Predict future time windows
        print("\n[Step 4] Predicting future time windows...")
        future_dates = pd.date_range(start='2024-11-01', periods=forecast_days, freq='D')
        future_predictions = self.predictor.predict_time_windows(future_dates, processed_data)
        
        # 5. Optimize time windows
        print("\n[Step 5] Optimizing time windows...")
        optimal_windows = self.optimizer.find_optimal_windows(
            future_predictions, 
            campaign_duration=7,  # each promotion cycle lasts 7 days
            min_gap=2
        )
        
        print(f"\nRecommended promotion windows ({len(optimal_windows)}):")
        for i, window in enumerate(optimal_windows, 1):
            print(f"  Window {i}: {window['start_date'].strftime('%b %d')} - {window['end_date'].strftime('%b %d')}")
            print(f"    Estimated conversions: {window['estimated_conversions']:.0f}")
            print(f"    Confidence: {window['confidence']:.1%}")
        
        # 6. Channel ROI analysis
        print("\n[Step 6] Channel ROI analysis...")
        # Simulated historical channel data
        np.random.seed(42)
        channel_data = pd.DataFrame({
            'channel': np.random.choice(['social', 'email', 'search', 'display'], 1000),
            'cost': np.random.uniform(500, 8000, 1000),
            'revenue': np.random.uniform(800, 15000, 1000),
            'conversions': np.random.randint(1, 100, 1000),
            'impressions': np.random.randint(1000, 20000, 1000)
        })
        
        roi_data = self.allocator.calculate_channel_roi(channel_data)
        for channel, data in roi_data.items():
            print(f"  {channel}: ROI={data['roi']:.2f}, conversion rate={data['conversion_rate']:.3f}")
        
        # 7. Budget allocation optimization
        print("\n[Step 7] Budget allocation optimization...")
        channel_allocation = self.allocator.optimize_budget_allocation(roi_data)
        print("\nChannel budget allocation:")
        total_allocated = 0
        for channel, budget in channel_allocation.items():
            print(f"  {channel}: ¥{budget:,.0f} ({budget/self.total_budget:.1%})")
            total_allocated += budget
        
        print(f"  Total: ¥{total_allocated:,.0f} (budget utilization: {total_allocated/self.total_budget:.1%})")
        
        # 8. Joint time-channel allocation
        print("\n[Step 8] Joint time-channel allocation...")
        time_allocation = self.allocator.allocate_time_based_budget(optimal_windows, channel_allocation)
        
        print("\nDetailed execution plan:")
        for window_key, channels_budget in time_allocation.items():
            window_start, window_end = window_key.split('_')
            print(f"\n  Promotion window: {window_start} to {window_end}")
            for channel, budget in channels_budget.items():
                daily_budget = budget / 7  # 7-day window
                print(f"    {channel}: total budget ¥{budget:,.0f}, daily average ¥{daily_budget:,.0f}")
        
        # 9. Generate execution recommendations
        print("\n[Step 9] Execution recommendations...")
        self._generate_recommendations(optimal_windows, channel_allocation, time_allocation)
        
        return {
            'windows': optimal_windows,
            'channel_allocation': channel_allocation,
            'time_allocation': time_allocation,
            'predictions': future_predictions
        }
    
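    def _generate_recommendations(self, windows, channel_allocation, time_allocation):
        """
        Generate execution recommendations.
        """
        print("\n" + "="*60)
        print("Execution Recommendations")
        print("="*60)
        
        # 1. Time window recommendations
        print("\n1. Time window strategy:")
        if len(windows) >= 2:
            # Windows arrive ranked by score; order them by date so the
            # warm-up window really comes before the peak window
            ordered = sorted(windows, key=lambda w: w['start_date'])
            print("   - Use a two-window 'warm-up + peak' strategy")
            print(f"   - Warm-up: {ordered[0]['start_date'].strftime('%b %d')} - {ordered[0]['end_date'].strftime('%b %d')}")
            print(f"   - Peak: {ordered[1]['start_date'].strftime('%b %d')} - {ordered[1]['end_date'].strftime('%b %d')}")
        else:
            print("   - Use a single-window concentrated peak strategy")
        
        # 2. Channel strategy
        print("\n2. Channel strategy:")
        sorted_channels = sorted(channel_allocation.items(), key=lambda x: x[1], reverse=True)
        for i, (channel, budget) in enumerate(sorted_channels, 1):
            print(f"   {i}. {channel.upper()}: budget share {budget/self.total_budget:.1%}")
            if channel == 'social':
                print("      Suggestion: focus on KOL partnerships and short-video content")
            elif channel == 'email':
                print("      Suggestion: segmented outreach, reactivating existing customers and warming up new ones")
            elif channel == 'search':
                print("      Suggestion: optimize bid keywords and protect brand terms")
            elif channel == 'display':
                print("      Suggestion: retargeting and cart-abandonment recovery")
        
        # 3. Risk notes
        print("\n3. Risk notes:")
        print("   - Monitor competitor activity closely and prepare contingency plans")
        print("   - Hold back 10-15% of the budget as a flexible reserve")
        print("   - Set up a real-time dashboard and review and adjust daily")
        
        # 4. KPI targets
        print("\n4. Suggested KPI targets:")
        total_conversions = sum(w['estimated_conversions'] for w in windows)
        print(f"   - Total conversion target: {total_conversions:.0f}")
        # Conversions divided by budget is not a conversion rate, so report cost per conversion
        if total_conversions > 0:
            print(f"   - Budget per estimated conversion: ¥{self.total_budget / total_conversions:,.2f}")
        print("   - ROI target: >2.0")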

# Run the complete plan
planner = Double11CampaignPlanner(total_budget=1000000)
campaign_plan = planner.plan_campaign(historical_days=365, forecast_days=30)

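5. Advanced Techniques and Best Practices

5.1 Real-Time Adjustment Mechanism

Market conditions change constantly, so a real-time adjustment mechanism is needed: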

class RealTimeAdjuster:
    def __init__(self, campaign_plan):
        self.original_plan = campaign_plan
        self.performance_data = []
        
    def monitor_performance(self, daily_data):
        """
        Monitor daily performance.
        """
        # Compute the deviation of actual from predicted results
        for window in self.original_plan['windows']:
            if window['start_date'] <= daily_data['date'] <= window['end_date']:
                predicted = window['estimated_conversions'] / 7  # predicted daily average
                actual = daily_data['actual_conversions']
                variance = (actual - predicted) / predicted if predicted > 0 else 0
                
                self.performance_data.append({
                    'date': daily_data['date'],
                    'predicted': predicted,
                    'actual': actual,
                    'variance': variance
                })
                
                return variance
        return 0
    
    def adjust_budget(self, variance_threshold=0.2):
        """
        Adjust the budget based on recent performance.
        """
        if len(self.performance_data) < 3:
            return self.original_plan
        
        recent_variance = np.mean([d['variance'] for d in self.performance_data[-3:]])
        
        if abs(recent_variance) > variance_threshold:
            adjustment_factor = 1 + (recent_variance * 0.5)  # partial adjustment
            
            # Build a new plan; a shallow .copy() would share the nested
            # channel_allocation dict and silently modify the original plan
            adjusted_plan = dict(self.original_plan)
            adjusted_plan['channel_allocation'] = {
                channel: budget * adjustment_factor
                for channel, budget in self.original_plan['channel_allocation'].items()
            }
            
            print(f"Budget adjustment: performance deviation {recent_variance:.1%}, adjustment factor {adjustment_factor:.2f}")
            return adjusted_plan
        
        return self.original_plan
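
Scaling every channel by the same factor can push total spend above the approved budget when performance beats the forecast. One possible safeguard, sketched here as an assumption rather than part of the adjuster above, is to rescale the adjusted budgets proportionally so that their sum never exceeds the cap. The function name and figures are hypothetical.

```python
def rescale_to_budget(channel_budgets, total_budget):
    """Shrink budgets proportionally if their sum exceeds total_budget."""
    total = sum(channel_budgets.values())
    if total <= total_budget:
        return dict(channel_budgets)  # already within the cap
    scale = total_budget / total
    return {channel: amount * scale for channel, amount in channel_budgets.items()}

adjusted = {'social': 30000, 'email': 18000, 'search': 12000}  # sums to 60000
capped = rescale_to_budget(adjusted, total_budget=50000)
print(capped)
```

Proportional rescaling keeps the relative mix between channels unchanged; whether the extra spend should instead come from a reserve is a business decision.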

5.2 A/B Testing Framework

Before a full-scale launch, it is advisable to run small-scale A/B tests:

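def ab_test_framework():
    """
    A/B testing framework: prints the suggested test scenarios.
    """
    test_scenarios = [
        {
            'name': 'Time window test',
            'variants': ['Early-bird phase', 'Warm-up phase', 'Peak phase'],
            'metrics': ['Conversion rate', 'Average order value', 'ROI']
        },
        {
            'name': 'Channel mix test',
            'variants': ['Social + search', 'Email + display', 'All channels'],
            'metrics': ['Customer acquisition cost', 'Conversion rate', 'User quality']
        }
    ]
    
    print("Suggested A/B tests:")
    for scenario in test_scenarios:
        print(f"\n{scenario['name']}:")
        for variant in scenario['variants']:
            print(f"  - {variant}")
        print(f"  Metrics to track: {', '.join(scenario['metrics'])}")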
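
Once a test has run, the variants still need to be compared statistically. A common choice for conversion rates is a two-proportion z-test; the sketch below uses only the standard library, and the function name and sample counts are made up for illustration. It assumes samples large enough for the normal approximation to hold.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates (B minus A)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)           # pooled rate under H0
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))         # two-sided normal tail
    return z, p_value

# Variant A: 200 conversions out of 10,000; variant B: 260 out of 10,000
z, p_value = two_proportion_z_test(200, 10000, 260, 10000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the difference is unlikely to be noise, but the sample size and the significance threshold should be fixed before the test starts to avoid reading too much into early results.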

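6. Summary and Tool Checklist

6.1 Key Success Factors

Data quality: ensure the data is complete and accurate
Model selection: choose a prediction model that fits the characteristics of the business
Dynamic adjustment: set up real-time monitoring and adjustment
Cross-team collaboration: close cooperation among operations, engineering, customer service, and other teams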

6.2 Recommended Tool Stack

Data processing: Pandas, NumPy
Machine learning: Scikit-learn, XGBoost, LightGBM
Optimization: PuLP, SciPy
Visualization: Matplotlib, Seaborn, Plotly
Monitoring: Prometheus, Grafana
Workflow: Airflow, Prefect

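6.3 Recommendations for Continuous Improvement

Regular reviews: run an in-depth review after every campaign and update the models
Feature engineering: keep looking for new, effective features
Model iteration: try more advanced models, such as deep learning and reinforcement learning
Industry benchmarking: follow industry best practices and keep refining the strategy

With the methodology and tools described in this article, you can build a data-driven schedule forecasting system for marketing campaigns and improve both marketing results and resource efficiency. Remember that the best system is one that keeps learning and adapting; staying close to the data and understanding the business in depth are the keys to success.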