Excel Powerquery tarda aproximadamente 4 horas en actualizarse

Excel Powerquery tarda aproximadamente 4 horas en actualizarse

He escrito una serie de PowerQueries que calculan los empleados que acumulan 12 meses de enfermedad teniendo en cuenta su patrón de turnos (puede que no sea de lunes a viernes) y sus horas contratadas.
Devuelve el número de días de enfermedad, el número de casos de enfermedad y la puntuación de Bradford.
Si alguien está enfermo en cualquier momento del fin de semana, se cuenta como el mismo suceso.

Por el momento, los datos sin procesar están contenidos en dos archivos CSV con 180.908 registros. Crece en unos 3100 registros por semana, lo que representa las horas trabajadas por empleado por día.
Los datos originales tienen 33 columnas, pero solo uso 10. Las otras columnas se eliminan como uno de los primeros pasos de las consultas.

El problemaes que lo ejecuté y tardó unas cuatro horas en actualizarse... demasiado tiempo.
¿Alguien puede ver algún cuello de botella aparte del hecho de que es una "base de datos" de Excel, por lo que no puede utilizar ningún plegado de consultas?
Cualquier ayuda sería muy apreciada.

Un ejemplo de los datos es: ingrese la descripción de la imagen aquí

Crea algunos datos sin procesar:

  • A1:={"Date","Facility","Department","Status","Forename","Surname","Emp No","Contract","Sick","Wkg Patt"}
  • A2: =SEQUENCE(438,,44296)(formato como fecha)
  • B2:=IF(A2#<>"",{"Warehouse","Inbound","Perm","D","BC","1",40})
  • I2: =IF(MID(J2,WEEKDAY(A2,16),1)=0,"",IF(RANDBETWEEN(0,50)<10,RANDBETWEEN(1,8),"")) - y llenar
  • J2: =IF(A2#<>"","0011111 " & REPT(CHAR(RANDBETWEEN(65,80)),RANDBETWEEN(0,5)))- esto representa el patrón de turnos con una serie de 0 y 1 que comienza un sábado. El texto al final muestra que esta columna no fue diseñada originalmente para esto, por lo que es posible que también contenga otro texto.

Guarde los datos en dos archivos CSV (acabo de cambiar "D", "BC" y "1" en la fórmula B2 antes de volver a guardarlos).

El archivo de informes de Excel también debe contener tres tablas (que se pueden dejar vacías para el ejemplo).

  • Leaversque contiene dos columnas ( Emp Noy Name)
  • NameUpdatesque contiene dos columnas ( Emp Noy Name)
  • IgnoreListque contiene tres columnas ( Emp Noy Start Date) End Date.

También debería haber dos celdas con nombre: StartDatey EndDate. Estas células albergan actualmente17-jun-22y18-jun-21respectivamente.

Las consultas:

fnGetNamedRange - obtiene un valor de un rango con nombre. Utilizado dentroDatos sin procesarconsulta.

let GetNamedRange=(NamedRange) => 
    let
        name = Excel.CurrentWorkbook(){[Name=NamedRange]}[Content],
        value = name{0}[Column1]
    in
        value
in GetNamedRange  

Datos sin procesar Esta consulta combina los dos archivos CSV, elimina los abandonos enumerados en la Leaverstabla, actualiza los nombres mal escritos en la NameUpdatestabla y filtra la tabla según las fechas requeridas.

let
    //Import data from the Reporting Suite.
    Source = Csv.Document(File.Contents(
        "C:\Documents\Raw Data Suite.csv"
        ),[Delimiter=",", Columns=33, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    RemoveTopRows = Table.Skip(Source,0), //Normally 3, set to 0 for example.
    ReportingSuitePromoteHeaders = Table.PromoteHeaders(RemoveTopRows, [PromoteAllScalars=true]),
    ReportingSuiteRemoveOtherColumns = Table.SelectColumns(ReportingSuitePromoteHeaders,{"Date", "Facility", "Status", "Forename", "Surname", "Emp No", "Contract", "Department", "Sick", "Wkg Patt"}),

    //Import data from the Archive.
    ArchiveSource = Csv.Document(File.Contents(
        "C:\Documents\Archive.csv"
        ),[Delimiter=",", Columns=33, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    ArchivePromoteHeaders = Table.PromoteHeaders(ArchiveSource, [PromoteAllScalars=true]),
    ArchiveRemoveOtherColumns = Table.SelectColumns(ArchivePromoteHeaders,{"Date", "Facility", "Status", "Forename", "Surname", "Emp No", "Contract", "Department", "Sick", "Wkg Patt"}),

    //Combine the two tables of data.
    CombineTables = Table.Combine({ReportingSuiteRemoveOtherColumns, ArchiveRemoveOtherColumns}),

    //Join Forename & Surname.
    TrimNames = Table.TransformColumns(CombineTables,{{"Forename", Text.Trim, type text}, {"Surname", Text.Trim, type text}}),
    JoinEmployeeName = Table.CombineColumns(TrimNames,{"Forename", "Surname"},Combiner.CombineTextByDelimiter(" ", QuoteStyle.None),"Name"),
    ChangeNameCase = Table.TransformColumns(JoinEmployeeName,{{"Name", Text.Proper, type text}}),

    //Reference Name Updates table.
    NameSource = Excel.CurrentWorkbook(){[Name="NameUpdates"]}[Content],
    ChangeDataType1 = Table.TransformColumnTypes(NameSource,{{"Emp No", type text}, {"Name", type text}}),
    MergeTables = Table.NestedJoin(ChangeNameCase, {"Emp No"}, ChangeDataType1, {"Emp No"}, "NameUpdates", JoinKind.LeftOuter),
    ExpandNameUpdates = Table.ExpandTableColumn(MergeTables, "NameUpdates", {"Name"}, {"NameUpdates.Name"}),
    AddNewNameColumn = Table.AddColumn(ExpandNameUpdates, "New Name", each if [NameUpdates.Name] = null then [Name] else [NameUpdates.Name], type text),

    RemoveOtherColumns = Table.SelectColumns(AddNewNameColumn,{"Date", "Facility", "Status", "New Name", "Emp No", "Contract", "Department", "Sick", "Wkg Patt"}),
    RenameColumn = Table.RenameColumns(RemoveOtherColumns,{{"New Name", "Name"}}),
    ChangeDataType2 = Table.TransformColumnTypes(RenameColumn,
        {{"Date", type date}, {"Facility", type text}, {"Status", type text}, {"Name", type text}, {"Emp No", type text}, {"Contract", type number}, {"Department", type text}, {"Sick", type number}, {"Wkg Patt", type text}}),

    //Filter to required dates and select Perm employees who have had a period of sick.
    FilterDates = Table.SelectRows(ChangeDataType2, each [Date] >= Date.From(fnGetNamedRange("StartDate")) and [Date] <= Date.From(fnGetNamedRange("EndDate"))),
    FilterRows = Table.SelectRows(FilterDates, each ([Status] = "Perm")),

    //Reference the Leavers table.
    LeaversSource = Excel.CurrentWorkbook(){[Name="Leavers"]}[Content],
    RemovedName = Table.SelectColumns(LeaversSource,{"Emp No"}),
    ChangeDataType3 = Table.TransformColumnTypes(RemovedName,{{"Emp No", type text}}),
    RemoveLeavers = Table.NestedJoin(FilterRows, {"Emp No"}, ChangeDataType3, {"Emp No"}, "Leavers", JoinKind.LeftAnti),
    RemoveLeaverColumn = Table.RemoveColumns(RemoveLeavers,{"Leavers"}),
    SortTable = Table.Sort(RemoveLeaverColumn,{{"Emp No", Order.Ascending}, {"Date", Order.Ascending}}),

    //Indicate which days are work days and which aren't.
    GetShiftPattern = Table.AddColumn(SortTable, "Shift Pattern", each Text.Start(Text.From([Wkg Patt], "en-GB"), 7), type text),
    IsShiftPattern = Table.AddColumn(GetShiftPattern, "ShiftPatternCheck", each Value.Is(Value.FromText(Text.Start(Text.From([Wkg Patt], "en-GB"), 7)), type number)),
    IsWorkDay = Table.AddColumn(IsShiftPattern, "WorkDay", each if [ShiftPatternCheck]=true then Text.Middle([Shift Pattern], Date.DayOfWeek([Date], Day.Saturday), 1) else 
                                                                if Date.DayOfWeek([Date],Day.Saturday)<=1 then 0 else 1, type text),
    ChangeDataType4 = Table.TransformColumnTypes(IsWorkDay,{{"WorkDay", Int64.Type}}),
    RemoveShiftPatternColumns = Table.Buffer(Table.RemoveColumns(ChangeDataType4,{"Wkg Patt", "Shift Pattern", "ShiftPatternCheck"})),

    //Combine Departments
    RemoveOtherColumns1 = Table.SelectColumns(RemoveShiftPatternColumns,{"Emp No", "Department"}),
    RemoveDuplicates1 = Table.Distinct(RemoveOtherColumns1),
    GroupDepartments = Table.Group(RemoveDuplicates1, {"Emp No"}, {{"Department", each Text.Combine([Department], ", "), type nullable text}}),
    AddDepartments = Table.NestedJoin(RemoveShiftPatternColumns, {"Emp No"}, GroupDepartments, {"Emp No"}, "Dept", JoinKind.LeftOuter),
    ExpandDepartments = Table.ExpandTableColumn(AddDepartments, "Dept", {"Department"}, {"Department.1"}),
    RemoveOtherColumns2 = Table.SelectColumns(ExpandDepartments,{"Date", "Facility", "Status", "Name", "Emp No", "Contract", "Department.1", "Sick", "WorkDay"}),
    RenameColumns = Table.Buffer(Table.RenameColumns(RemoveOtherColumns2,{{"Department.1", "Department"}}))
    
in
    RenameColumns

Días no laborables- enumera los días libres de cada empleado según los 0 y 1 en la columna Wkg Patt.

let
    Source = #"Raw Data",
    RemoveWorkingDays = Table.SelectRows(Source, each ([WorkDay] = 0)),
    RemoveOtherColumns = Table.Buffer(Table.SelectColumns(RemoveWorkingDays,{"Date", "Emp No"}))
in
    RemoveOtherColumns  

fnDíasdeRed- calcula el número de días laborables excluyendo los días festivos (o días libres en este caso)

let 
    func = (StartDate as date, EndDate as date, optional Holidays as list, optional StartOfWeek as number) =>
        let 
            // optional StartOfWeek, if empty the week will start on Monday
            startOfWeek = if StartOfWeek = null then 1 else StartOfWeek,

            // Providing for logic where EndDate is after StartDate
            Start = List.Min({StartDate, EndDate}),
            End = List.Max({StartDate, EndDate}),
            
            // Switch sign if EndDate is before StartDate
            Sign = if EndDate < StartDate then -1 else 1,
            
            // Get list of dates between Start- and EndDate
            ListOfDates = List.Dates(Start, Number.From(End-Start)+1,#duration(1,0,0,0)),


            // if the optional Holidays parameter is used: Keep only those dates in the list that don't occur in the list of Holidays;
            // otherwise continue with previous table
            DeleteHolidays = if Holidays = null then ListOfDates else List.Difference(ListOfDates, List.Transform(Holidays, Date.From )),

            // Count the number of days (items in the list)
            CountDays = List.Count(DeleteHolidays) * Sign
        in 
            CountDays
in 
    func  

fnDÍAS NO LABORABLES - devuelve las fechas en las que un empleado no está trabajando según la Non-work daysconsulta.

let 
    non_work_dates = (SourceTable as table, StartDate as date, EndDate as date, EmpNo as text) =>
        let 
            Source = SourceTable,
            FilterToEmpNo = Table.SelectRows(Source, each [Emp No] = EmpNo and [Date] >= StartDate and [Date] <= EndDate),
            RemoveEmpNo = Table.SelectColumns(FilterToEmpNo,{"Date"}),
            ToList = RemoveEmpNo[Date]
        in 
            ToList
in
    non_work_dates

Períodos de enfermedad- identifica las fechas de inicio y finalización de cada período de enfermedad

let
    Source = #"Raw Data",
    FilterToSick = Table.SelectRows(Source, each ([Sick] <> null)),

    //The data is sorted by Employee Number and Date so periods of sickness per employee appear next to each other.
    //Add three indexes so data from the previous and next row may be referenced in the current row.
    //Index1 starts at 1.  
    AddIndex1 = Table.AddIndexColumn(FilterToSick, "Index1", 0, 1, Int64.Type),
    Index1PK = Table.AddKey(AddIndex1, {"Index1"}, true),
    SortByIndex1 = Table.Sort(Index1PK, {"Index1", Order.Ascending}),

    //We're only adding the Date & Emp No from the previous & next row to the current row.
    //So a table is created containing only those two fields.
    //Add Index2 to the JoinTable.  This will start at 0
    JoinTable = Table.SelectColumns(Index1PK,{"Date", "Emp No"}),
    AddIndex2 = Table.AddIndexColumn(JoinTable, "Index2", 1, 1, Int64.Type),
    Index2PK = Table.AddKey(AddIndex2, {"Index2"}, true),
    SortByIndex2PK = Table.Sort(Index2PK, {"Index2", Order.Ascending}),

    //Add Index3 to the JoinTable.  This will start at 2
    AddIndex3 = Table.AddIndexColumn(JoinTable, "Index3", -1, 1, Int64.Type),
    Index3PK = Table.AddKey(AddIndex3, {"Index3"}, true),
    SortByIndex3PK = Table.Sort(Index3PK, {"Index3", Order.Ascending}),

    //Add the Employee Number and Date from the next row to the current row.
    Index2Merge = Table.Join(Table.PrefixColumns(SortByIndex2PK, "Prev"), {"Prev.Index2"}, Table.PrefixColumns(SortByIndex1,"Current"), {"Current.Index1"}, JoinAlgorithm.SortMerge),
    SortByDate1 = Table.Sort(Index2Merge,{{"Current.Emp No", Order.Ascending}, {"Current.Date", Order.Ascending}}),

    Index3Merge = Table.Join(Table.PrefixColumns(SortByIndex3PK,"Next"), {"Next.Index3"}, SortByDate1,  {"Current.Index1"}, JoinAlgorithm.SortMerge),
    SortByDate1Again = Table.Sort(Index3Merge,{{"Current.Emp No", Order.Ascending}, {"Current.Date", Order.Ascending}}),

    DaysSincePreviousSick = Table.AddColumn(SortByDate1Again, "Days Since Previous", each try fnNetworkDays([Prev.Date],[Current.Date],fnNONWORKDAYS(#"Non-work days",[Prev.Date], [Current.Date], [Current.Emp No]))-1 otherwise null),
    DaysToNextSick = Table.AddColumn(DaysSincePreviousSick, "Days To Next", each try fnNetworkDays([Current.Date],[Next.Date],fnNONWORKDAYS(#"Non-work days",[Current.Date], [Next.Date], [Current.Emp No]))-1 otherwise null),

    IdentifyStart = Table.AddColumn(DaysToNextSick, "Start", each if [Days Since Previous]<>1 or [Prev.Date]=null or [Current.Emp No]<>[Prev.Emp No] then "Start" else null),
    IdentifyEnd = Table.AddColumn(IdentifyStart, "End", each if [Days To Next]<>1 or [Current.Emp No]<>[Next.Emp No] then "End" else null),
    RemoveOtherColumns = Table.SelectColumns(IdentifyEnd,{"Current.Date", "Current.Facility", "Current.Status", "Current.Name", "Current.Emp No", "Current.Contract", "Current.Department", "Current.Sick", "Start", "End"}),
    RenameColumns= Table.RenameColumns(RemoveOtherColumns,{{"Current.Date", "Date"}, {"Current.Facility", "Facility"}, {"Current.Status", "Status"}, {"Current.Name", "Name"}, 
                                                           {"Current.Emp No", "Emp No"}, {"Current.Contract", "Contract"}, {"Current.Department", "Department"}, {"Current.Sick", "Sick"}})
in 
    RenameColumns  

fnSumEntre - suma las horas de enfermedad entre dos fechas para un número de empleado específico.

let
    Sum_Between = (SourceTable as table, StartDate as date, EndDate as date, EmpNo as text) =>
        let  
            Source = SourceTable,
            FilterToData = Table.SelectRows(Source, each [Emp No] = EmpNo and [Date] >= StartDate and [Date] <= EndDate),
            GroupRows = Table.Group(FilterToData, {"Emp No"}, {{"Sick Hours", each List.Sum([Sick]), type nullable number}})
        in  
            GroupRows
in
    Sum_Between

Periodos totales de enfermedad - devuelve una sola fila para cada período de enfermedad junto con el número de días de enfermedad.

let
    Source = #"Periods of Sick",
    MergeStartAndEnd = Table.CombineColumns(Source,{"Start", "End"},Combiner.CombineTextByDelimiter("", QuoteStyle.None),"SickPeriods"),
    
    //Add multi-day sick.
    FilterRows1 = Table.SelectRows(MergeStartAndEnd, each ([SickPeriods] = "End" or [SickPeriods] = "Start")),
    AddIndex1 = Table.AddIndexColumn(FilterRows1, "Index1", 0, 1, Int64.Type),
    AddIndex2 = Table.AddIndexColumn(AddIndex1, "Index2", 1, 1, Int64.Type),

    MergeIndexes = Table.Join(AddIndex2, "Index1", Table.PrefixColumns(AddIndex2,"1"), "1.Index2", JoinAlgorithm.SortMerge),
    FilterRows2 = Table.SelectRows(MergeIndexes, each ([SickPeriods] = "End")),
    RemoveOtherColumns1 = Table.SelectColumns(FilterRows2,
            {"1.Date", "Date", "Facility", "Status", "Name", "Emp No", "Contract", "Department"}),
    RenameColumns1 = Table.RenameColumns(RemoveOtherColumns1,{{"1.Date", "From"}, {"Date", "To"}}),

    AddSickHours = Table.AddColumn(RenameColumns1, "Sick", each fnSumBetween(#"Periods of Sick",[From],[To],[Emp No])),
    ExpandTable = Table.ExpandTableColumn(AddSickHours, "Sick", {"Sick Hours"}, {"Sick Hours"}),

    //Add single day sick periods.
    FilterRows3 = Table.SelectRows(MergeStartAndEnd, each ([SickPeriods] = "StartEnd")),
    DuplicateDateColumn = Table.DuplicateColumn(FilterRows3, "Date", "To"),
    RemoveOtherColumns2 = Table.SelectColumns(DuplicateDateColumn,{"Date", "To", "Facility", "Status", "Name", "Emp No", "Contract", "Department", "Sick"}),
    RenameColumns2 = Table.RenameColumns(RemoveOtherColumns2,{{"Date", "From"}, {"Sick", "Sick Hours"}}),
    
    //Join single and multi day sick periods.
    AppendData = Table.Combine({ExpandTable, RenameColumns2}),
    SortRows = Table.Sort(AppendData,{{"Emp No", Order.Ascending}, {"From", Order.Ascending}}),
    CalculateDaysSick = Table.AddColumn(SortRows, "Days Sick", each Number.Round([Sick Hours]/([Contract]/5),2))
in
    CalculateDaysSick  

Puntuación de Bradford - devuelve el número total de días y ocurrencias de enfermedad junto con elPuntuación de Bradford

let
    Source = #"Total Periods of Sick",
    GroupRows = Table.Group(Source, {"Emp No", "Name", "Department", "Facility"}, {{"Total Days", each List.Sum([Days Sick]), type number}, {"Occurrences", each Table.RowCount(_), Int64.Type}}),
    AddBradfordScore = Table.AddColumn(GroupRows, "Bradford Score", each Number.Power([Occurrences],2)*[Total Days])
in
    AddBradfordScore

Respuesta1

Estás utilizando la herramienta equivocada, como si intentaras utilizar un martillo de orfebre para clavar un clavo de ferrocarril. Haymuchosbases de datos relacionales disponibles, incluidas las gratuitasLibreOffice Base, ymuchos otros. ParaBase, y muchas otras bases de datos, laLos archivos CSV se pueden importar a una nueva base de datoscon poco esfuerzo.

Puede que tarde un poco enaprender los conceptos básicos de un lenguaje de consulta de base de datos, como una versión de SQL, pero facilitará su tareamucho más simpley más alcanzable.

información relacionada